Uses for local AI?
from smeeps@lemmy.mtate.me.uk to selfhosted@lemmy.world on 24 Jul 2024 14:34
https://lemmy.mtate.me.uk/post/7472

Im using Ollama on my server with the WebUI. It has no GPU so its not quick to reply but not too slow either.

Im thinking about removing the VM as i just dont use it, are there any good uses or integrations into other apps that might convince me to keep it?

#selfhosted

threaded - newest

just_another_person@lemmy.world on 24 Jul 2024 14:51 next collapse

None

RandomLegend@lemmy.dbzer0.com on 24 Jul 2024 15:07 next collapse

It’s a tool like any other. If you don’t have any usecase for it, just don’t use it.

I use it to summarize release notes and generate some minor descriptions for generic stuff in my TTRPG campaigns.

DrinkMonkey@lemmy.ca on 24 Jul 2024 18:25 collapse

generate some minor descriptions for generic stuff in my TTRPG campaigns.

Need a quick 200 word description of the interior of an apothecary? Or a band of marauding orcs? It’s been a huge time saver for me.

RandomLegend@lemmy.dbzer0.com on 24 Jul 2024 18:48 collapse

Yup, never had to usw “Random NPC Merchant No. 14” again.

slazer2au@lemmy.world on 24 Jul 2024 15:09 next collapse

Wanting answers to things you don’t want google to know that you don’t know.

dwindling7373@feddit.it on 24 Jul 2024 15:36 collapse

There are a huge number of vastly better solutions to get that…

Aquila@sh.itjust.works on 24 Jul 2024 15:56 next collapse

Such as…?

dwindling7373@feddit.it on 24 Jul 2024 17:22 next collapse

A privacy respecting search engine.

AustralianSimon@lemmy.world on 25 Jul 2024 02:16 collapse

Duckduckgo or SearX

umami_wasbi@lemmy.ml on 26 Jul 2024 06:28 collapse

IMO LLMs are ok to get a head start of searching. Like got a vague idea of something but don’t know the exact keywords. LLMs can help and use the output on whatever search engine you like. This saves a lots of time tinkering the right keywords.

dwindling7373@feddit.it on 26 Jul 2024 08:30 collapse

Sure, or you could send an email to the leading international institution on the matter to get a very accurate answer!

Is it the most reasonable course of action? No. Is it more reasonable than waste a gazillion Watt so you can maybe get some better keywords to then paste in a search engine? Yes.

kitnaht@lemmy.world on 27 Jul 2024 22:45 collapse

Once the model is trained, the electricity that it uses is trivial. LLMs can run on a local GPU. So you’re completely wrong.

dwindling7373@feddit.it on 28 Jul 2024 06:53 collapse

No I’m not. Other questions?

kitnaht@lemmy.world on 28 Jul 2024 21:52 collapse

Those were statements. Statements of fact.

Once the models are already trained, it takes almost no power to use them.

Yes, TRAINING the models uses an immense amount of power - but utilizing the training datasets locally consumes almost nothing. I can run the llama 7b set on a 15w Raspberry Pi for example. Just leaving my PC on uses 400w. This is all local – Nothing entering or leaving the Pi. No communication to an external server, nothing being done on anybody else’s server or any AWS instances, etc.

dwindling7373@feddit.it on 29 Jul 2024 06:47 next collapse

Notwithstanding that running an LLM is still more expensive than a search engine, in any reasoning around running an LLM you must include the training and, most of all, the incentive as a consumer you are giving to further training.

It’s like arguing that cooking a steak has negligible environmental impact. The point is the whole industry meant to provide you the steak in the first place.

dwindling7373@feddit.it on 29 Jul 2024 06:47 collapse

Notwithstanding that running an LLM is still more expensive than a search engine, in any reasoning around running an LLM you must include the training and, most of all, the incentive as a consumer you are giving to further training.

It’s like arguing that cooking a steak has negligible environmental impact. The point is the whole industry meant to provide you the steak in the first place.

pe1uca@lemmy.pe1uca.dev on 24 Jul 2024 15:20 next collapse

I’ve used it to summarize long articles, news posts, or videos when the title/thumbnail looks interesting but I’m not sure if it’s worth the 10+ minutes to read/watch.
There are other solutions, like a dedicated summarizer, but I’ve investigated into them and they only extract exact quotes from the original text, an LLM can also paraphrase making the summary a bit more informative IMO.
(For example, one article mentioned a quote from an expert talking about a company, the summarizer only extracted the quote and the flow of the summary made me believe the company said it, but the LLM properly stated the quote came from the expert)

This project github.com/goniszewski/grimoire has in it’s road map a way to connect to an AI to summarize the bookmarks you make and generate at 3 tags.
I’ve seen the code, I don’t remember what the exact status of the integration.


Also I have a few models dedicated for coding, so I’ve also asked a few pieces of code and configurations to just get started on a project, nothing too complicated.

VeryNiiiice@sh.itjust.works on 24 Jul 2024 17:06 collapse

Which one do you use to summerize videos?

AnUnusualRelic@lemmy.world on 25 Jul 2024 12:10 next collapse

Does it work with porn videos?

maniel@sopuli.xyz on 26 Jul 2024 20:52 collapse

asking the important question, but yeah, the plot is essential in porn

pe1uca@lemmy.pe1uca.dev on 28 Jul 2024 20:12 collapse

Well, it’s a bit of a pipeline, I use a custom project to have an API to be able to send files or urls to summarize videos.
With yt-dlp I can get the video and transcribe it with fast whisper (github.com/SYSTRAN/faster-whisper), then the transcription is sent to the LLM to actually make the summary.

I’ve been meaning to publish the code, but it’s embedded in a personal project, so I need to take the time to isolate it '^_^

WeLoveCastingSpellz@lemmy.dbzer0.com on 24 Jul 2024 15:20 next collapse

playing dnd alone is pretty cool

badcommandorfilename@lemmy.world on 24 Jul 2024 15:31 next collapse

“cool”

RandomLegend@lemmy.dbzer0.com on 24 Jul 2024 15:32 collapse

Any model recommendation for that?

The ones i tried get stuck in a loop at some point due to the small context windows.

1rre@discuss.tchncs.de on 24 Jul 2024 16:12 next collapse

Yeah even gpt4o couldn’t keep track of encounters, run battles etc. in my case…

I think if you wanted to do it mechanically consistently you’d probably need to integrate it into a vtt where you give it context and potentially fine-tune it to give quest related summaries & gming rather than just “stuff”

RandomLegend@lemmy.dbzer0.com on 24 Jul 2024 16:35 next collapse

VTT integration would be one hell of a job to do.

Bluesheep@lemmy.world on 26 Jul 2024 06:23 collapse

I don’t know how tech savvy you are, but I’m assuming since your on lemmy it’s pretty good :)

The way we’ve solved this sort of problem in the office is by using the LLM’s JSON response, and a prompt that essentially keeps a set of JSON objects alongside the actual chat response.

In the DND example, this would be a set character sheets that get returned every response but only changed when the narrative changes them. More expensive, and needing a larger context window, but reasonably effective.

WeLoveCastingSpellz@lemmy.dbzer0.com on 24 Jul 2024 16:38 collapse

the answer is very spesific to ur pc and amount of vram you have availşble to you. But anything lama 3 even 8b models finetuned to DM or write stories should theoritically work. The other reply that reccomends connecting to another program to make sure rules are consistent sounds like a great idea whşch I have not tried. I use silly tavern as the ui whşch has lots of options and shit to mske thşngs wkrk well. I would reccomend goşng şnto the “KoboldAI” discord and askşng şn the support sectşon folk there are very helpfull sorry for not beşng able to gşve a strsight answer Also boost the context size way up that shit makes dşfference I habe like 16k or sumthin. good luck!

RandomLegend@lemmy.dbzer0.com on 24 Jul 2024 16:45 collapse

What on earth is going on with your keyboad?!

Besides that, i have 20GB of VRAM and 64GB or RAM. I can run the mixtral 8x7b model relatively usable. Currently i use oobabooga the most.

WeLoveCastingSpellz@lemmy.dbzer0.com on 24 Jul 2024 17:35 collapse

I type very poorly on my phone. with that much vram ypu csn get somethşng lşke a 70b model defineyly ask around in the koboldai community that shşt’s crszy

minnix@lemux.minnix.dev on 24 Jul 2024 15:48 next collapse

Ollama without a GPU is pretty useless unless you’re using with Apple silicon. I’d just get rid of it until you get a GPU.

possiblylinux127@lemmy.zip on 24 Jul 2024 17:36 next collapse

I have never tested in on Apple silicon but it works fine on my laptop

minnix@lemux.minnix.dev on 24 Jul 2024 20:56 collapse

What are your laptop specs?

possiblylinux127@lemmy.zip on 24 Jul 2024 22:35 collapse

Intel 12th gen i5

minnix@lemux.minnix.dev on 24 Jul 2024 23:05 collapse

CPU is only one factor regarding specs, a small one at that. What kind of t/s performance are you getting with a standard 13B model?

possiblylinux127@lemmy.zip on 24 Jul 2024 23:16 collapse

I don’t have enough ram to run a 13b. I just stick to Mistral 7b and it works fine.

smeeps@lemmy.mtate.me.uk on 24 Jul 2024 21:33 collapse

Works fine on an 11th Gen i5. Not fast but not slow

yesman@lemmy.world on 24 Jul 2024 18:32 next collapse

Think of LLMs like a stupid office worker. You wouldn’t rely on them to make critical decisions, but they’re valuable for tedious stuff.

For example, my calendar changed the way to enter new events breaking my workflow. Now I just type out a skeletal schedule and have LLM convert that into a .csv that I import.

I’m thinking of Ripping my CD collection again. I’m researching a way to use a LLM to tidy up the metadata.

I had a folder full of random stuff I’ve saved for years. Had a LLM organize and categorize it for me. I had to tweak the prompt enough that this was a medium difficulty task, but still way easier than doing it manually.

domi@lemmy.secnd.me on 25 Jul 2024 13:44 next collapse

I’m thinking of Ripping my CD collection again. I’m researching a way to use a LLM to tidy up the metadata.

If you ever figure out how to use AI to determine the genre(s) of a song, let me know. Have been looking for something like that for quite a while.

BuccaneerScientist@discuss.tchncs.de on 02 Aug 2024 07:39 collapse

Nextcloud Recognize is supposed to do that, but I haven’t tried it. You might try looking down that road.

domi@lemmy.secnd.me on 02 Aug 2024 08:09 collapse

Thanks for the tip! I took a look and it seems like Recognize uses this: github.com/jordipons/musicnn

Last update was 4 years ago but will give it a try this weekend.

miau@lemmy.sdf.org on 25 Jul 2024 18:42 next collapse

Can you share some info on how you did that folder organization? Did you provide the AI with a list of files?

andreas@lemmy.kfed.org on 26 Jul 2024 06:43 collapse

for the metadata, LLMs may not prove so great. Use MusicBrainz Picard or Beets

Banthex@feddit.org on 24 Jul 2024 19:07 next collapse

github.com/…/paperless_sort_low_quality_ollama let ai tag your paperless ngx files base on content.

bizarroland@fedia.io on 24 Jul 2024 20:01 next collapse

I have a 4070 sitting around collecting dust that I got from a trade, I've been thinking about setting it up with whispr and TTS and having a way to talk to my house.

I have a couple of smart home integrations, mostly air conditioning, light switches, security, and doors.

What I would like would be to have a few speakers on the walls that can talk to my server where I can say something like, hey computer, turn on the lights in the dining room and the lights in the dining room would turn on without transmitting that information to Google or Amazon.

umami_wasbi@lemmy.ml on 24 Jul 2024 22:03 next collapse

You can try integrating with SEPIA. Not that I used it befote but it surely looks promising.

bizarroland@fedia.io on 25 Jul 2024 01:06 collapse

Wonderful. I'll check it out. Thank you!

possiblylinux127@lemmy.zip on 26 Jul 2024 15:32 collapse

I am really curious if you can get the traditional smart functionality along with a LLM. Maybe have some sort of keyword the prompts the AI. You also could write a custom generated system prompt that includes the weather, time and any other information

[deleted] on 24 Jul 2024 21:09 next collapse
.
thirdBreakfast@lemmy.world on 24 Jul 2024 22:26 next collapse

I use the Continue VS Code plugin with Ollama to use a couple of different models (deepseek-coder-v2 & starcoder2) to recreate a local only Github Copilot type experience for coding. This is on an M1 Apple Silicon though. For autocomplete the generation needs to be pretty brisk - I’m not sure how that would go in a VM without a GPU.

Amongussussyballs100@sh.itjust.works on 27 Jul 2024 02:39 collapse

How well does the M1 chip keep up? What size models are you running with it? Interested in getting an M1 laptop and I am curious.

thirdBreakfast@lemmy.world on 27 Jul 2024 06:37 collapse
starcoder2:latest       	f67ae0f64584	1.7 GB	3 days ago 	
phi3:latest             	d184c916657e	2.2 GB	3 weeks ago	
deepseek-coder-v2:latest	8577f96d693e	8.9 GB	3 weeks ago	
llama3:8b-instruct-q8_0 	1b8e49cece7f	8.5 GB	3 weeks ago	
dolphin-mistral:latest  	5dc8c5a2be65	4.1 GB	3 weeks ago	
codeqwen:latest         	df352abf55b1	4.2 GB	3 weeks ago	
llama3:latest           	365c0bd3c000	4.7 GB	4 weeks ago

I mostly use starcoder2 with Continue for code autocomplete, the big deepseek coder is a bit slow (I can feel it thinking), but it and the regular llama3 are good for chatbot type programming questions.

I don’t really have anything to compare the M1 performance to. I guess the 8GB models output text a little slower than the web versions of the same models, and the 4GB ones about the same. Using ollama in the terminal, there’s sometimes a 0.5-2 second pause before it starts outputting. Not with phi3 though - it’s surprisingly snappy for the quality of answers.

hendrik@palaver.p3x.de on 25 Jul 2024 06:52 next collapse

Roleplay (text adventures), a (stupid but occasionally funny) dungeon master, translation and help with creativity. These are the use cases I found. If you don't need that, you might get rid of it.

andreas@lemmy.kfed.org on 26 Jul 2024 06:40 next collapse

I use local AI for coding (more recently) and ML Photo storage facial recognition and security camera object detection (been using the later 2 for years now actually, don’t want that kind of info out on someone else’s cloud training on my images)

possiblylinux127@lemmy.zip on 26 Jul 2024 15:30 collapse

I use it for everything. I did move to ollama to podman and I replaced webUI with Alpaca