
Textpattern CMS support forum


#13 2021-03-27 12:31:48

phiw13
Plugin Author
From: Japan
Registered: 2004-02-27
Posts: 3,208
Website

Re: Images and the file system layout from 4.9.0

So many things to think about before formulating something of an answer.

Two notes maybe

  • predictable, relatively easy to understand URL/location/folder structure. This is maybe more important for people who use FTP frequently. I am a little uncomfortable —at first sight— with the hash idea, although I haven’t thought about it much yet.
  • possibility of working with filename instead of image ID. I think you mentioned this somewhere offhand, or did I dream it ?

I am still very unhappy with using PHP + GD or ImageMagick to generate images, especially for the type of images I often deal with (recently: dozens and dozens of screenshots of UI stuff).


Where is that emoji for a solar powered submarine when you need it ?
Sand space – admin theme for Textpattern

Offline

#14 2021-03-27 13:21:11

Vienuolis
Member
From: Vilnius, Lithuania
Registered: 2009-06-14
Posts: 310
Website GitHub GitLab Mastodon Twitter

Re: Images and the file system layout from 4.9.0

Bloke wrote #329494:

Would we be better ditching ids and using filenames, despite the potential filename clashes. And if you rename/replace a file, what then?

I see only one robust image file-naming method: add the volatile image filename as an alias to its stable canonical ID, which stays in place and is never renamed.

For example, here are links to my picture on Open Drive:

Both URLs:
https://od.lk/s/146878367_mWLy3 and https://od.lk/s/146878367_mWLy3/audits.png
should point to the same picture (sorry, od.lk redirects to a different URL on opendrive.com, so I do not have a clean example of this).

On our system, the equivalent links could look like:

https://textpattern.com/images/123.png =
https://textpattern.com/images/123name.png
if it is possible at all.
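
A minimal sketch of how such an alias could be folded back onto the canonical file, written in Python purely for illustration (Textpattern itself is PHP). The function name and the pattern are hypothetical, and it assumes the alias never begins with a digit, otherwise the ID and the alias cannot be told apart:

```python
import re

# Assumption: the canonical ID is the leading run of digits, and the optional
# alias that follows it does not itself start with a digit.
_ALIAS_PATTERN = re.compile(r"^(?P<id>\d+)(?P<alias>[^/.]*)\.(?P<ext>[A-Za-z0-9]+)$")


def canonical_filename(requested: str) -> str:
    """Drop the volatile alias and return the stable ID-based filename."""
    match = _ALIAS_PATTERN.match(requested)
    if match is None:
        raise ValueError(f"no leading image ID in {requested!r}")
    return f"{match['id']}.{match['ext']}"


print(canonical_filename("123name.png"))
print(canonical_filename("123.png"))
```

Both calls give 123.png, so the alias can change freely without the stored file moving. An alias such as 2021.png would break this, so a real scheme would probably need a separator between the ID and the name.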

Offline

#15 2021-03-27 13:39:21

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,456
Website GitHub

Re: Images and the file system layout from 4.9.0

phiw13 wrote #329511:

I am a little uncomfortable —at first sight— with the hash idea

Yeah. It’s not ideal. I’m open to better ideas. One thing we could consider is to use the id split up. So, I dunno…

/public/images/1/23/123.jpg
/public/images/1/23/1230.jpg

Not sure how that scales or if it’s a lumpy distribution. And what happens with id values of fewer than three digits?
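
A rough sketch of that id-split idea, in Python purely for illustration. The function name and the root path default are hypothetical, and the zero-padding rule for ids shorter than three digits is an assumption, not something that has been decided:

```python
def split_id_path(image_id: int, ext: str = "jpg", root: str = "/public/images") -> str:
    """Map an image id to a nested path built from its leading digits."""
    if image_id < 1:
        raise ValueError("image_id must be a positive integer")
    # Assumption: left-pad short ids with zeros so there are always
    # three digits available to split into the two directory levels.
    digits = str(image_id).zfill(3)
    return f"{root}/{digits[0]}/{digits[1:3]}/{image_id}.{ext}"


print(split_id_path(123))
print(split_id_path(1230))
print(split_id_path(7))
```

With this split, the 1/23 directory holds 123, then 1230 to 1239, then 12300 to 12399, so each directory grows tenfold with every extra digit. That is one way the distribution could turn out lumpy.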

possibility of working with filename instead of image ID.

Sure, if we can. ID values are attractive because they’re multi user compliant and easier to administer when names might change. But I do like names if we can come up with some suitable scheme.

I am still very unhappy with using PHP + GD or ImageMagick to generate images

Noted. You mean in general, from a quality standpoint? Or performance?

I’m conscious of the need to make it easy to create/replace images with your own. Need to give this more thought.


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

Offline

#16 2021-03-27 13:56:18

etc
Developer
Registered: 2010-11-11
Posts: 5,210
Website GitHub

Re: Images and the file system layout from 4.9.0

Bloke wrote #329513:

One thing we could consider is to use the id split up. So, I dunno…

/public/images/1/23/123.jpg
/public/images/1/23/1230.jpg

I’m not really fond of it, since its logic is artificial. But what’s wrong with date-based directories? How many images could a person upload per day?

Offline

#17 2021-03-27 14:02:27

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,456
Website GitHub

Re: Images and the file system layout from 4.9.0

Vienuolis wrote #329512:

I see only one robust image file-naming method: add the volatile image filename as an alias to its stable canonical ID, which stays in place and is never renamed.

This has merit. We can use server rules to skip the name (maniqui did this back in the day) but we need to consider Nginx too.

Any ideas on how to implement it at a technical filesystem/PHP level would be gratefully received.

Offline

#18 2021-03-27 14:22:50

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,456
Website GitHub

Re: Images and the file system layout from 4.9.0

etc wrote #329514:

I’m not really fond of it, since its logic is artificial.

Me neither.

what’s wrong with date-based directories? How many images could a person upload per day?

Nothing. It’s not a question of volume. It’s being able to find the damn images in five years when you need to move some around. Or if you want to manually intervene and replace a few.

And how, programmatically, do we find the image? If we use the upload datestamp (aka ‘now’) and stash that in the database metadata so we can find the image, I guess that’s fine. But you still need to look up the date stored to find the file. With the id (or potentially its name) it’s atomic. If you know the id you can find where it lives without the database.

If we use the image’s datestamp, that might help if people are really good at organising their pics using (e.g) exif or iptc metadata or are good at housekeeping images filed logically. Not so good for casual photos from a phone.

Plus if you later replace an image and the new one has a different date stamp, do we move the image to a new subdir to reflect its new stamp? Thus breaking direct URLs. Or leave it in its original location so the datestamp of the yyyy/mm/dd subdir doesn’t match the files it contains?

I’m not married to any system. I’m trying to use this discussion to find the best way to store images so:

  • humans can easily find them if necessary.
  • the system can find them with minimal extra hoops/info.
  • the system scales and is reasonably well distributed so the number of files per dir remains manageable as the number of images grows.
  • files that are related – usually those of different sizes – are close together so they can be easily operated upon by hand (e.g. if you want to replace a bunch with your own versions via FTP, it should be easy to do so).
  • it’s performant.

Lots to balance. There’s no perfect solution. Just need the best compromise for all of the above.

Offline

#19 2021-03-27 17:51:12

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,091
Website GitHub Mastodon Twitter

Re: Images and the file system layout from 4.9.0

Bloke wrote #329516:

  • files that are related – usually those of different sizes – are close together so they can be easily operated upon by hand (e.g. if you want to replace a bunch with your own versions via FTP, it should be easy to do so).

Does this mean that the buttons to replace the thumbs (Browse, Reset, Upload) will be removed?

Last edited by colak (2021-03-27 17:52:05)


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#20 2021-03-27 18:39:38

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,456
Website GitHub

Re: Images and the file system layout from 4.9.0

colak wrote #329528:

Does this mean that the buttons to replace the thumbs (Browse, Reset, Upload) will be removed?

Yes. Well, probably. Because it makes better use of screen real estate with an unknown number of thumbs if you have a dropdown of all available sizes for the current image. Pick one, it’s loaded in and you can operate on just that thumb.

  • If you use the upload/replace tool on a thumb, then just that thumb is replaced.
  • If you select the working copy (aka main) image then I was thinking you’d get an additional checkbox when you replace it, to offer the option to “recreate all sizes on upload?”
  • If you select the original size image from the dropdown and choose to replace that, all images in the set will be recreated automatically because that’s the controlling picture for this ID. So if you donkey with that, you are doing it for a very good reason. A warning that appears alongside the upload box could help make this clear.
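
The three rules above could be sketched roughly like this, in Python purely for illustration. The size labels "original" and "main", the function name and the recreate_all flag (standing in for the checkbox) are all hypothetical, and it assumes the checkbox on the main image regenerates thumbs only, never the original:

```python
from typing import List


def sizes_to_regenerate(replaced: str, all_sizes: List[str], recreate_all: bool = False) -> List[str]:
    """Return the sizes that would be rebuilt after one file is replaced."""
    if replaced == "original":
        # The original controls the whole set, so everything else is rebuilt.
        return [size for size in all_sizes if size != "original"]
    if replaced == "main":
        # Assumption: only the thumbs follow the working copy, and only
        # when the "recreate all sizes on upload?" box is ticked.
        if recreate_all:
            return [size for size in all_sizes if size not in ("original", "main")]
        return []
    # A single thumb was replaced: nothing else is touched.
    return []


sizes = ["original", "main", "1920x1440", "240x240"]
print(sizes_to_regenerate("240x240", sizes))
print(sizes_to_regenerate("main", sizes, recreate_all=True))
print(sizes_to_regenerate("original", sizes))
```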

One thing I haven’t figured out yet is whether to allow cropping and color correction tools on individual images. Part of me says why not. You might have a massive picture at full res and want to zoom in on a portion of it for the smaller image so the subject isn’t too tiny when shrunk – think art direction.

You’d have to have some way to apply the changes to just that image. That’s easy: an Apply button. Incidentally, we’re planning to offer undo states. I’ve mostly figured that out.

But another part of me thinks that if you want to do art direction in a <picture> tag then you could just upload a second image (different id) and switch to it at whatever res you want in your tag.

That means you can only operate (crop/rotate/colorize/etc) on the working (main) image and the tools are hidden when you select other sizes. And a checkbox near the Apply button could allow you to push the artistic changes to all other thumbs. Or maybe to a chosen subset of thumbs in case you’ve manually replaced one and want it to be skipped.

I don’t want to offer the kitchen sink here. People who take this seriously will likely pre-process their images offline anyway and upload the various sizes by hand, either via FTP or through the interface via the upload/replace tool on the Image Edit panel. But it’d be nice to offer rudimentary control to admins to perform oft-used tools to tinker with the image. If we restrict that to the main image for simplicity and force your thumbs to always remain in sync with it, then I’m fine with that.

That was my thinking anyway. If anyone has better ideas, please speak up.

Offline

#21 2021-03-27 20:33:42

etc
Developer
Registered: 2010-11-11
Posts: 5,210
Website GitHub

Re: Images and the file system layout from 4.9.0

Bloke wrote #329516:

It’s not a question of volume.

Ah, ok, then I have misunderstood the issue, never had to deal with an image-intensive site.

And how, programmatically, do we find the image? … With the id (or potentially its name) it’s atomic. If you know the id you can find where it lives without the database.

Do we often need to retrieve only the image URL? Generally, images are output through the <txp:images /> tag, and the latter queries the db anyway to retrieve the author, the description and so on.

Offline

#22 2021-03-27 21:59:52

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,456
Website GitHub

Re: Images and the file system layout from 4.9.0

etc wrote #329531:

Do we often need to retrieve only the image URL?

For the front-end, you’re right that most access is through tags. But the admin-side relies (and will rely more so in 4.9) on fast access to the images – either URL or path – so we can switch images from the UI and operate on them. During the act of operating on the files, we’ll create temp images and then write those back to replace the old version when Apply is tapped.

It’s not too much hardship on the Image Edit panel to look up the images and metadata (on page load, as normal) and return all their paths, then stuff those in some JS variables.

Say you uploaded a pic in 2018 and it’s stashed in 2018/08/25/42_original.jpg (plus 42_1920x1440.jpg and 42_240x240.jpg). Fast forward to today and you want to add a couple more thumbs by hand.

You click ‘replace’ and select a couple of thumbnails. They’ll be given timestamps of ‘now’ and should, by rights, be stored in 2021/03/27/42_800x600.jpg and 42_400x300.jpg. But if we do that, they’re separated from the original set, which means if you want to find the related pics, you need to look in two places.

If, however, we store them in the existing directory, we not only have to write extra code to detect this on upload/replace to divert them away from their ‘normal’ location of “today”, but the timestamps of the files also don’t match the directory for those two image sizes. That may not be an issue. But it might.
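
To make the two options concrete, here is a small sketch in Python (for illustration only). The function name is hypothetical, and the id_size filename pattern is simply taken from the example paths above:

```python
from datetime import date


def dated_path(stamp: date, image_id: int, size: str, ext: str = "jpg") -> str:
    """Build a yyyy/mm/dd path for one size of an image."""
    return f"{stamp:%Y/%m/%d}/{image_id}_{size}.{ext}"


uploaded = date(2018, 8, 25)  # the date that would be stashed in the database
today = date(2021, 3, 27)     # 'now' when the extra thumbs are added

# Option 1: file the new thumbs under 'now', away from the original set.
print(dated_path(uploaded, 42, "original"))
print(dated_path(today, 42, "800x600"))

# Option 2: reuse the stored upload date so the set stays together,
# at the cost of looking that date up first.
print(dated_path(uploaded, 42, "800x600"))
```

Either way, the path cannot be derived from the id alone: the date has to come from somewhere.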

I’m trying to figure out some distribution strategy based on something immutable, and it seems as if the ID is the perfect piece of data. Everything else about a file could change if you upload it or mess about with its metadata, but once the ID is set for the image, that’s it. It’s unique, and therefore if we can key the set of files off that to derive their location, not only is it more deterministic for us (programmatically), it’s also more deterministic for people who want to import a truckload of images from another CMS or external system.

They can very easily rename their images to a sequential set, and use the same algorithm that core uses to mimic what Txp’s directory structure will be, thus pre-populating that structure. Stuff that on the server, then all we need to have is some way of core to link the files to the DB. And if the metadata has been exported to (or constructed in) a companion file (e.g. XML) it’s a 5-line plugin to call TxpXML to iterate over it.
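
As a rough idea of what that offline preparation might look like, here is a sketch in Python (for illustration only). Everything in it is hypothetical: the function name, the _original suffix borrowed from the earlier example, and above all the layout callable, which stands in for whatever algorithm core ends up using:

```python
import os
from typing import Callable, Dict, List


def plan_import(filenames: List[str], first_id: int, layout: Callable[[int], str]) -> Dict[str, str]:
    """Assign sequential ids to files and work out where each would live."""
    plan = {}
    for offset, name in enumerate(sorted(filenames)):
        image_id = first_id + offset
        ext = os.path.splitext(name)[1].lower()
        # 'layout' is a placeholder for the real id-to-directory algorithm.
        plan[name] = f"{layout(image_id)}/{image_id}_original{ext}"
    return plan


# Stand-in layout for the demo only: bucket by the last two digits of the id.
demo_layout = lambda image_id: f"images/{image_id % 100:02d}"

for source, target in plan_import(["b.png", "a.JPG"], 100, demo_layout).items():
    print(source, "->", target)
```

The point is only that the whole plan can be computed without touching the server or the database, as long as the layout depends on nothing but the id.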

If the filesystem uses dates, people won’t be able to do that. They won’t know which date Txp is going to use: The file creation date? Its modified date? The date the file is uploaded? The server date or local system date?

I really wanted dates to work because they solve the distribution thing nicely. It’s what I had originally planned. But the more I thought about it and tried examples and scenarios, the less enthusiastic I became about their long-term applicability.

I don’t like hashes either particularly as it’s still a layer of indirection, but I can’t think of anything better that gives us a decent spread of images, allows us to keep related images together, is scalable, immutable, deterministic, fast to compute, performant in remote browsing (FTP) situations, and permits relatively simple offline preparation of content for bulk upload.
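
For anyone trying to picture the hash approach, here is one possible shape for it, sketched in Python for illustration only. The details are assumptions rather than the actual proposal: MD5 of the id, two directory levels of two hex characters each, and hypothetical function names:

```python
import hashlib


def hashed_dir(image_id: int, depth: int = 2, width: int = 2) -> str:
    """Derive a directory from the id alone, with no database lookup."""
    # Assumption: MD5 is used only to spread ids evenly, not for security.
    digest = hashlib.md5(str(image_id).encode("ascii")).hexdigest()
    parts = [digest[level * width:(level + 1) * width] for level in range(depth)]
    return "/".join(parts)


def hashed_path(image_id: int, size: str, ext: str = "jpg", root: str = "images") -> str:
    """Every size of one image lands in the same hashed directory."""
    return f"{root}/{hashed_dir(image_id)}/{image_id}_{size}.{ext}"


for size in ("original", "1920x1440", "240x240"):
    print(hashed_path(42, size))
```

Two levels of two hex characters give 65,536 possible directories, and the same id always maps to the same one, so related sizes stay together. The indirection is still there for humans, though: nobody can guess the directory for id 42 without running the function.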

I’m all ears/eyes if someone can come up with something better.

Last edited by Bloke (2021-03-27 22:08:42)

Offline

#23 2021-03-28 06:05:44

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,091
Website GitHub Mastodon Twitter

Re: Images and the file system layout from 4.9.0

colak wrote #329528:

Does this mean that the buttons to replace the thumbs (Browse, Reset, Upload) will be removed?

Bloke wrote #329529:

Yes. Well, probably. Because it makes better use of screen real estate with an unknown number of thumbs if you have a dropdown of all available sizes for the current image. Pick one, it’s loaded in and you can operate on just that thumb.

Sorry to be a pest, but by “operate on just that thumb”, do you mean using the web interface, where the Browse, Reset, Upload buttons will still be available?

I am thinking of people who have clients, and the online interface would be very important.

That means you can only operate (crop/rotate/colorize/etc) on the working (main) image and the tools are hidden when you select other sizes. And a checkbox near the Apply button could allow you to push the artistic changes to all other thumbs. Or maybe to a chosen subset of thumbs in case you’ve manually replaced one and want it to be skipped.

Having the tools available for each size/version of the image would be excellent for the picture tag.

Offline

#24 2021-03-28 08:20:46

philwareham
Core designer
From: Haslemere, Surrey, UK
Registered: 2009-06-11
Posts: 3,564
Website GitHub Mastodon

Re: Images and the file system layout from 4.9.0

Having tools available for every size of every image seems way too complicated to me. My vision was to have the tools operate on an appropriate copy of the original image, then generate sizes from that image. If a user wants to override one of the sized images in that flow, they can manually upload it via a replace image button.

Otherwise we are talking temp images for potentially loads of work files and an unwieldy file system.

Offline
