Ask HN: What AI Software Product/Software Do You Need?

  • > perhaps some inspiration for something to build.

    First off, thank you for stating this explicitly. I get annoyed by the posts clearly fishing for ideas while acting like they have no specific motive for the questions. So I like your transparency!

    Honestly, I'd like an AI digital art product that has nothing to do with LLMs and is instead akin to where "style transfer" work was heading a few years back, though it never quite landed where I hoped. What I'd like to see is an AI that can look at multiple artistic works done by the user and blend them together for new results, all sourced from that user/artist's original works. Something that lets me say, "I love how the aesthetic of my painting turned out, but I wish the image depicted this photo I took." Style transfer, but personal styles only, and without the "Deep Dream" artifacts that filled that stuff back in the day.

    I have no idea if people have kept working on those areas or not. I found references to various GANs in more recent years, but they all still seem to suffer from those "Deep Dream" artifacts that made the initial work interesting but ultimately unusable for creative pursuits.

  • I recently read somewhere that a paper is a small part of a long ongoing conversation. I find this description apt.

    To comprehend the latest paper, I frequently find it necessary to grasp the context provided by its citations. All the pertinent papers collectively form a directed acyclic graph (DAG).

    Is there a tool available that would enable me to organize papers in a DAG, allowing me to formulate a structured reading plan? Currently, my PDFs are scattered across different locations, and I essentially have to rely on memory to recall the dependencies between papers.

    I also want to share my organized papers with the same research group.
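
    A minimal sketch of the reading-plan part, assuming the citation dependencies have already been entered by hand. The paper names and the `reading_plan` helper are hypothetical; the ordering itself comes from Python's standard `graphlib` module (3.9+):

```python
from graphlib import TopologicalSorter

# Hypothetical data: each paper maps to the papers it builds on (its prerequisites).
cites = {
    "paper_c": {"paper_a", "paper_b"},
    "paper_b": {"paper_a"},
    "paper_a": set(),
}

def reading_plan(dependencies):
    """Return papers ordered so every prerequisite comes before its dependents."""
    return list(TopologicalSorter(dependencies).static_order())

plan = reading_plan(cites)
print(plan)
```

    If the dependencies accidentally contain a cycle, `static_order()` raises `graphlib.CycleError`, which is a useful check on hand-entered data. Storing the mapping in a shared file would be one way to pass it around a research group.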

  • Fractal book summaries: I'd like a tool that lets me put in a PDF, .epub, or .mobi of a book and have it output chapter-by-chapter summaries at varying levels of detail, so that I can read the book in a fractal way. I can start with a one-paragraph summary of each chapter, click on anything I want to see in more detail, and it'll be instant. (So it does all the summarization in one go before I start reading.)
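
    A rough sketch of the "summarize everything up front, expand instantly" idea. It assumes the chapters are already extracted as plain text (PDF/.epub/.mobi parsing is not shown), and `placeholder_summarize` is only a stand-in that keeps the first few sentences where a real summarization model would go. All names here are hypothetical:

```python
import re
from dataclasses import dataclass, field

def placeholder_summarize(text, max_sentences):
    """Stand-in for a real summarizer: keeps the first few sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])

@dataclass
class ChapterSummaries:
    title: str
    # levels[0] is the shortest summary; the last entry is the full text.
    levels: list = field(default_factory=list)

def build_fractal(chapters, sizes=(1, 3)):
    """Precompute every summary level up front so later lookups are instant."""
    book = []
    for title, text in chapters:
        levels = [placeholder_summarize(text, n) for n in sizes]
        levels.append(text)
        book.append(ChapterSummaries(title, levels))
    return book

def expand(book, chapter_index, level):
    """Return the requested detail level, clamped to the deepest one available."""
    levels = book[chapter_index].levels
    return levels[min(level, len(levels) - 1)]

book = build_fractal([("One", "A. B. C. D.")])
print(expand(book, 0, 0))
```

    Clicking for more detail then maps to calling `expand` with a higher `level`, which is just a list lookup because nothing is summarized at read time.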

  • I want something that will let me hook up the output of an LLM to a bash terminal and feed the terminal output right back into the LLM, maybe in a container. I would want a separate prompt for instructing the LLM on what its goal is. For example, make a script that prints out an ASCII picture of a cat. The LLM then gets to work in the bash terminal, using vi or whatever to bang out the script. The supervisor LLM would be able to ask questions or get additional input when it wanted to. I would want this to be sane and not awful to use. Points for free software. Big points for locally hosted.
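
    The core loop could look something like the sketch below. Everything here is an assumption for illustration: `next_command` stands in for the model call (it receives the transcript so far and returns a shell command, or `None` when done), and `scripted_model` is a fake model used only to show the flow. There is no sandboxing in this sketch, so real use would belong inside a container as suggested above:

```python
import subprocess

def run_agent(goal, next_command, max_steps=10, timeout=30):
    """Feed each command's output back to the model until it returns None."""
    transcript = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        command = next_command(transcript)  # hypothetical model call
        if command is None:
            break
        # Each command runs in a fresh bash process on the host.
        result = subprocess.run(
            ["bash", "-c", command],
            capture_output=True, text=True, timeout=timeout,
        )
        transcript.append(f"$ {command}")
        transcript.append(result.stdout + result.stderr)
    return transcript

def scripted_model(transcript):
    """Stand-in for an LLM: runs one fixed command, then stops."""
    return "echo meow" if len(transcript) == 1 else None

log = run_agent("print a cat sound", scripted_model)
print("\n".join(log))
```

    Because every command gets its own process, shell state such as `cd` does not persist between steps, and interactive editors like vi would need a pseudo-terminal rather than `subprocess.run`.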

  • Here's a fun project that you could try. Use TTS to narrate books, but make the narration feel more realistic. Give each character in the book a unique voice. Leading characters should have voices based on their personalities. Use quote extraction and character attribution to tie characters to lines. Try to convey the human qualities with EmotionML, SSML, or some kind of semantic analysis.

    The best would be a TTS system at the level of OpenAI's but with voice selection like GCP TTS so you can get quality and a range of voices.

    Copyright would probably spike any monetization effort but you could try. It would be nice to have an open source tool for this though! :)

  • I think RAG is the sweet spot of current tech. I've got a client with a 50-year repository of technical reports, and while my memory is great, the organization logic is abysmal.

    Something that can locate files, excerpts, and timelines and do basic QA in a point-and-shoot fashion would be a killer tool for so many small-to-medium orgs. It's basically plug-and-play search to bring engineers, scientists, technical staff, etc. up to speed. It adds flexibility without having to "train" someone up. Basically, you bypass ever having to hire interns.

  • A simple YouTube thumbnail generator would be great, thank you.

    Not all that complicated junk out there, just something simple: take a photo or two, ask me to describe the thumbnail and the text to put on top, and that’s it.

    Yet there is no product out there that can produce even a mediocre result.

    Which makes me think… isn’t this wave of hype just another scam?

    Does anyone remember Bitcoin five years ago?

  • A very smooth, no-BS app that does handwritten text -> Markdown.

    It should do headings, pictures (as local files), and other kinds of formatting as well.

  • An AI that is local and can tap into (either through fine-tuning or RAG) the complete context of what I see, hear, think, ingest, and excrete so that I can get a better understanding of who I am and how I can improve. I want to see trends about myself that could only be discovered by an AI that knows more about me than I know myself.

  • Something like the GPTs from OpenAI that I can run locally without being hosted in the cloud.

  • I wonder what AI used "for good" would look like. To take all the disinformation online and point out the contradictions and things that do not make sense in a clear way.

    Of course, it remains to be seen whether people would be convinced or would just look for their already established opinions, etc. Still, cynicism aside, I wonder what something that balances that toxicity using AI would look like. Maybe reducing news to plain facts, like news wire services?

    You could train on the comments of major newspapers.

  • A native macOS app that uses local LLMs for writing in any application, think enhanced autocomplete / tab completion etc... 100% local and no Electron.

  • Markup-friendly spell-checking that doesn't suck. It should spell-check comments in C code, but not syntax. It should work well with Emacs.
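
    For the "comments but not syntax" part, one possible approach is to tokenize just enough C to tell comments from string literals and then check only the comment text. This is a regex-based sketch, not a full C lexer; the tiny `dictionary` set and the function name are made up for illustration, and Emacs integration is not shown:

```python
import re

# String and char literals are matched first so that comment markers inside
# them are skipped; only the named groups capture real comment text.
TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # string literal
    r"|'(?:\\.|[^'\\])*'"     # char literal
    r"|/\*(?P<block>.*?)\*/"  # block comment
    r"|//(?P<line>[^\n]*)",   # line comment
    re.DOTALL,
)
WORD = re.compile(r"[A-Za-z']+")

def misspelled_in_comments(source, dictionary):
    """Return comment words that are absent from the given dictionary."""
    unknown = []
    for match in TOKEN.finditer(source):
        text = match.group("block") or match.group("line")
        if text is None:
            continue  # a literal, or an empty comment
        for word in WORD.findall(text):
            if word.lower() not in dictionary:
                unknown.append(word)
    return unknown

src = 'int recieve = 0; /* recieve the packet */\nchar *s = "// teh string"; // teh end\n'
dictionary = {"the", "packet", "end"}
found = misspelled_in_comments(src, dictionary)
print(found)
```

    Here the identifier `recieve` and the text inside the string literal are ignored, while the same typos inside the two comments are reported. A real tool would swap the toy set for a proper word list.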

  • iOS/Android keyboard that is context aware and can correct your typos based on context.