Think I am getting carried away with self hosting (stock image library)
from shellington@piefed.zip to selfhosted@lemmy.world on 14 Jun 15:06
https://piefed.zip/c/selfhosted@piefed.zip/p/1575920/think-i-am-getting-carried-away-with-self-hosting-stock-image-library

I have nearly every service imaginable running and have now started a new project.

I am creating a searchable stock photo archive for my LAN. It has been a very interesting project, but I think I may have crossed the line into overkill lol.

I had hundreds of stock photo CDs from the 90s, and I have turned them all into ISOs.

I then spent ages dealing with some strange CD-ROM layouts but got all the images off.

I then converted them all to JPG.

I have now set up a batch script that dedupes the images, then takes them in 2k batches and runs them through an AI vision model to add keywords and descriptions, as they have none.
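The dedupe-and-batch step could be sketched roughly as follows. This is only an illustration, not the script described above: it assumes "dedupe" means dropping byte-identical files (near-duplicates such as resized copies would need a perceptual hash instead), and the names `file_digest`, `unique_files` and `batched` are made up for the example.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def unique_files(paths: Iterable[Path]) -> Iterator[Path]:
    """Yield only the first file seen for each distinct content hash."""
    seen = set()
    for p in paths:
        digest = file_digest(p)
        if digest not in seen:
            seen.add(digest)
            yield p


def batched(items: Iterable[T], size: int = 2000) -> Iterator[List[T]]:
    """Group items into lists of at most `size` (2k per batch by default)."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch
```

Something like `for batch in batched(unique_files(sorted(Path("jpgs").rglob("*.jpg")))): ...` would then hand each batch to the tagging step, with `jpgs` standing in for wherever the converted images live.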

They are then copied to a folder where I have PhotoPrism running as the front end. I only have 4k done so far, but they look amazing, and the search and descriptions are really accurate and useful.

400k more images to go but at least it should all be automated now.

#selfhosted

Kirk@startrek.website on 14 Jun 15:09

Careful, once it’s automated you won’t be able to work on it anymore!

shellington@piefed.zip on 14 Jun 15:12

This feeds into my ultimate project, which I think will take me all summer.

I plan to create a LAN-wide search that covers Kiwix, Tube Archivist, Ubooquity, Paperless, Jellyfin, stock images and Stash.

Then I would have a unified search point, and I think it would make Kiwix far more usable by not having to go to the specific ZIM first. But it is a tricky project, as some things are nice and have APIs and others don’t.
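One way to cope with the mix of services that have APIs and ones that don't is to hide each behind a small search function and fan the query out to all of them. The sketch below is purely illustrative: `Hit`, `SearchFn` and `unified_search` are hypothetical names, and each real backend (Kiwix, Jellyfin and so on) would still need its own adapter written against whatever interface it actually offers.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str  # which service the result came from
    title: str
    url: str


# Each backend is just a function from a query string to a list of hits.
SearchFn = Callable[[str], List[Hit]]


def unified_search(query: str, backends: Dict[str, SearchFn]) -> List[Hit]:
    """Query every backend in parallel and merge the results.

    A backend that raises is treated as returning nothing, so one
    broken service does not take the whole search down.
    """

    def run(name: str) -> List[Hit]:
        try:
            return backends[name](query)
        except Exception:
            return []

    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        per_backend = list(pool.map(run, backends))
    # Results stay grouped by backend, in the order the backends were given.
    return [hit for hits in per_backend for hit in hits]
```

Services without an API could still get an adapter of the same shape, for example one that reads a local index built by scraping or by reading the service's database, so the front end only ever deals with `Hit` objects.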

irmadlad@lemmy.world on 14 Jun 15:11

Are you a web designer, or how do you utilize 400k+ stock images?

shellington@piefed.zip on 14 Jun 15:14

No, I’m not. I just kind of thought it would be nice to have. That way, if I ever need to make any cards or banners, I have loads of stuff in every category, with no ugly AI pics like most search engine image searches show these days.

irmadlad@lemmy.world on 14 Jun 15:46

Well, that’s pretty cool.

lemongarlic@lemmy.world on 14 Jun 15:19

You can use Immich. It’s not perfect for this use case, but it is searchable by content.

shellington@piefed.zip on 14 Jun 15:22

Might be worth a try.

The thing that has surprised me the most is how good this AI model is at accurately knowing what is in a picture when the model itself is only 3GB in size.

state_electrician@discuss.tchncs.de on 14 Jun 15:36

How do you do the AI tagging? That is something I need right now.

shellington@piefed.zip on 14 Jun 15:48

I’m running the images through Ollama using this model: gemma3:latest

It is running on CPU only and still gives decent throughput.
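For anyone wanting to try the same thing, a minimal sketch of sending one image to a local Ollama server might look like this. It assumes Ollama is listening on its default port 11434 and that `gemma3:latest` has already been pulled. The prompt text and the function names `build_payload` and `describe_image` are made up for the example and are not taken from the script discussed here.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Assumed prompt for illustration only.
PROMPT = (
    "Describe this stock photo in one sentence, then list up to ten "
    "comma-separated keywords."
)


def build_payload(image_bytes: bytes, model: str = "gemma3:latest") -> dict:
    """Build the JSON body: images are sent as base64-encoded strings."""
    return {
        "model": model,
        "prompt": PROMPT,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }


def describe_image(path: Path, url: str = OLLAMA_URL, timeout: float = 300.0) -> str:
    """Send one image to Ollama and return the model's text reply."""
    body = json.dumps(build_payload(path.read_bytes())).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"].strip()
```

The generous timeout is there because CPU-only inference can take a while per image. How the returned text gets written into the image (sidecar file, EXIF/XMP fields, or something else) is left out here, since the thread does not say which approach was used.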

state_electrician@discuss.tchncs.de on 14 Jun 16:20

Cool, thanks. I’ll look into it.

IratePirate@feddit.org on 14 Jun 15:40

That’s awesome! But… why?

shellington@piefed.zip on 14 Jun 15:50

Kind of getting ready for the day when the open web becomes pretty much unusable due to AI and ID requirements.

IratePirate@feddit.org on 14 Jun 16:19

And when the slopocalypse and technofascism have blown over, we crawl out of our digital bunkers and repopulate the wastelands of cyberspace with… stock images of Hide the Pain Harold? I love it! 😄 Although I could think of more valuable data to hoard for that event.

motruck@lemmy.zip on 14 Jun 16:07

Any chance you can upload your ISOs to archive.org?

Analog@lemmy.ml on 14 Jun 16:39

This is awesome!

Personally I would have used TIFF and either Immich or ResourceSpace (a DAM - meant for this kind of thing, but also maybe more institutional than you want.)

non_burglar@lemmy.world on 14 Jun 16:45

Personally I would have used TIFF

Damn, unlimited storage? In this economy?