> is phenomenal at Code
Can it:
- attend bullshit meetings
- lick my manager's boot
- tell them their ideas are good, but in the masochistic, self-deprecating way they like to hear it from me
Not yet? Darn it, guess my job is safe for now.
I'm waiting for a Copilot that has access to something like a language server: instead of just suggesting immediate completions and using open tabs for context, it could use the LS to pull references for context, suggest file creation, and add code in non-trivial places. Then add a chat interface on top. Suddenly I'm really 10x - I would pay hundreds of dollars per month for this if it worked well; it would let me handle at least two or three IC jobs. The next level after that would be getting access to the tasks/development board and automatically having that as context.
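Roughly the loop I'm imagining, as a sketch only: find_references below is a hypothetical stand-in for an LSP textDocument/references call, and the model side is just the standard OpenAI chat API.

    from openai import OpenAI

    client = OpenAI()

    def find_references(symbol: str) -> list[str]:
        """Hypothetical stand-in: ask the language server (textDocument/references)
        for every snippet in the workspace that references `symbol`."""
        raise NotImplementedError

    def suggest_edit(task: str, symbol: str) -> str:
        # Stuff the LS-gathered references into the prompt as context,
        # then ask the model for concrete edits (new files included).
        context = "\n\n".join(find_references(symbol))
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "You are a coding assistant. Propose concrete edits, "
                            "including new files, in non-trivial places."},
                {"role": "user",
                 "content": f"References for {symbol}:\n{context}\n\nTask: {task}"},
            ],
        )
        return resp.choices[0].message.content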
Copilot and ChatGPT are so clunky to work with - they help, but it's not groundbreaking. Once the workflow gets streamlined, there will be a time window where you can leverage your experience for 10x productivity.
GPT-4 is just really good at code. We documented some of the examples we found crazy amazing and worth sharing on Hacker News, including GPT-4's ability to handle Bazel, Kubernetes, Terraform, and Rust. A summary can be found in this thread: https://twitter.com/sualehasif996/status/1635755267739598848
Is it? According to OpenAI's paper:

                        GPT-4     GPT-4 (no vision)   GPT-3.5
    Leetcode (easy)     31 / 41   31 / 41             12 / 41
    Leetcode (medium)   21 / 80   21 / 80              8 / 80
    Leetcode (hard)      3 / 45    3 / 45              0 / 45
https://cdn.openai.com/papers/gpt-4.pdf (Table 1, page 5). So it's better than GPT-3.5, but still pretty pathetic at hard Leetcode problems. If your programming job is closer to Leetcode easy problems you might be in trouble, but for the real problems that aren't just gluing some libraries together, your job is safe.
My current take is that it's actually kind of bad at code in a lot of very important ways, but it is so good at getting to ~95% that you don't even feel the missing pieces once you are in flow. Having skills in the target area is important, but some quick runtime testing usually closes the gap.
The added confidence from being able to scream a horrible demand into the void and get back something nearly complete is beyond unbelievable to me. I'm making attempts at projects that I would never have considered otherwise, and I am becoming increasingly emboldened by the capability - "Without using 3rd party dependencies" is probably my #1 prompt nudge right now.
Honestly, at this point I am just looking forward to us taking this stuff for granted. It's all very nice and useful, but I think my threshold for guffawing at it is smaller than most.
I know we've still got some time left in the cycle, but I just want people to calm down and just start using it.
Yes, cool, you did xyz all at once, but what did you make with it? It's kinda akin to the whole "using rust" phenomenon. I only marginally care that you made it in rust, I care more about the thing itself you made!
I don't wanna come off sounding prescriptive; certainly stay in the hype world as long as you want, and I know there was a new bigger and better model that came out less than 12 hours ago. But personally I am looking forward to the inevitable future where how you made the thing stops being as important as, or more important than, what you made. I just miss the sense of inspiration you can glean from someone else's work. I don't think AI will ever obsolete that, however much it can obsolete boilerplate code and even tricky discrete problems.
Any concerns about sending code directly to a third party? I’m very hesitant to use any tool that is sending my assets to the cloud to be used who knows how.
We're in a situation where the junior side of the software job market has really dried up. My question is whether the junior side will ever come back. Will we ever have tech companies like we've had, with massive intern -> junior pipelines, gradually developing senior devs? Or will there be a narrower pipeline, with fewer senior devs working with AI tools?
An optimistic take is that by automating a lot of mundane work, our level of craftsmanship and creativity can increase. That we can have a much higher bar for quality and professionalism.
Unfortunately it will also mean a lot of junior folks may never get back into software jobs.
OP here. We're hiring! We're an early-stage startup, with funding from OpenAI, building the next-generation AI code editor. If you're the best hacker or designer ever, reach out to me at sualeh@anysphere.co. The problems we work on are really fun (extremely fast context retrieval from across big codebases, connecting language servers to language models, instruction-tuned embeddings) and we think the potential of what we're building is enormous (make all software engineers literally 10x more productive).
Unless these models are updated frequently, they're not going to be better than a human.
ChatGPT doesn't even know Go has generics yet. Now imagine it suggesting an inferior way of doing something because it doesn't know that a bug has been fixed or that a better feature/version has been implemented.
It often recommends made-up libraries, or libraries that don't exist anymore.
It even recommended using a Go wiki page as an import for an experimental Go library, because it doesn't know that the experimental library has been merged into the Go standard library.
Another big problem I had was with algorithms. It would suggest seemingly working code for an algorithm, but it would contain bugs that completely defeat the point of using the algorithm in the first place.
So if you don't know what you're doing, you would think it was giving you working code because it compiles, but in fact it gave you bad code.
Yeah, it might be good at doing very basic and boring things, but I would never trust it for anything complex. You won't know you shot yourself in the foot until later on, when you have to spend hours debugging because you copy-pasted code from ChatGPT.
Are software engineers going to go the way of human “computers”?
I know GPT-4 won't put us all out of work, but I worry that something could within a number of decades I can count on one hand. Of course, it would be gradual.
I am excited, disappointed, and as a young person who hopes to be a SWE for a long time, somewhat afraid.
Then again, it could be argued that this merely means GPT-4 is a compiler…
That's cool and all, but can it do "in this existing code, please amend this feature"? Also useful, "please cover this code with meaningful unit tests"?
Well, I think I just retired.
Even knowing this was coming, it's still a bit unnerving. Sad too, but it's for the best. Let the machines program the machines. Schmidhuber's Gödel machine. The automatic scientist.
I tried having it write some HTML and JS code for a chat app UI. It works amazingly well. I'm legitimately going to have ChatGPT help me write this side project website I've been working on.
In a couple of years the mythical 10x engineers will probably be 100x engineers, depending on how well they've managed to integrate this tech into their tooling.
I'm the "really good", but definitely not "phenomenal" camp.
That is, it works - but what is with the repeated class declaration for every h3 element?
<h3 class="text-xl font-bold mb-2">1. Pruning</h3>
The whole point of CSS is to declare properties like this once, and only once, for a group of functionally similar elements. Also note that switchTab has to be told the position of the tab it's switching to (it should be able to infer that from the element's id, or from its position in the list). Also note that the function has a hard-coded tab count baked in - a stealth bug waiting to be tripped on.
And why do we need to explicitly toggle the border-b-2 and border-yellow-400 properties? Shouldn't these be tucked away under some kind of class declaration (the same way the hidden property works)?
This may sound like nitpicking, but it's not - code like this becomes really hard to maintain at scale.
And just imagine someone asking this thing to generate an algorithm to say, determine whether someone's medical procedure should be covered under insurance or not.
I don't know TLA+ and haven't had time to learn it. Could GPT-4 be a boon for esoteric PL that have had trouble gaining mass adoption?
Clearly, one needs to learn and understand TLA+ before one can trust generated proofs, but maybe GPT-4 could be the mentor or gentle on-ramp I was missing.
I wonder if this could be used to port entire projects/libraries to different languages almost completely automatically? ElasticSearch in Rust? Wondering about the legal/license ramifications of doing so...
Since GPT is good at language, perhaps we can use it to generate documentation for software systems.
I just asked GPT-4 to "generate an ansible playbook that will install oh-my-zsh on a linux computer?" It did a much better job of the playbook than the one I created last week. GPT-4 did it in a few seconds, and mine took about an hour of Googling.
This is cool to see. I'd love to try it out and see how useful it is for me to do my job but I don't think it's likely that I'll be able to do that soon. I think eventually I will be using this every day just like I use lots of other tools.
I had endless trouble trying to get GPT-3 and now 4 to reverse engineer a generic function:

    result1 = Func1(input1)
    result2 = Func2(result1)
I want to give it result2 and have it tell me what input1 should be. Its attempts to implement a binary search have all failed.
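A minimal sketch of the inversion I mean, assuming the composed function is numeric and monotonic; Func1/Func2 below are stand-ins for illustration, not my real functions:

    def invert(f, target, lo=0.0, hi=1e9, tol=1e-9):
        """Binary-search for x such that f(x) ~= target, assuming f is increasing."""
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Stand-in functions for illustration only:
    Func1 = lambda x: 3 * x + 1
    Func2 = lambda y: y ** 3

    result2 = Func2(Func1(4.0))
    print(invert(lambda x: Func2(Func1(x)), result2))  # ~= 4.0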
Other programs have been largely positive though.
I'd like to start writing something based on GPT to build web scrapers, probably on top of Colly and Gorm. Is there a good resource for learning how to feed external data into GPT?
What is the cost per day to use GPT-4 to code vs paying a developer?
The future of programming is in AI - tools like replit, marsx.dev, and github copilot are bound to impress us soon.
I wonder how well it compares to a compiler at converting C++ or rust to assembly. (C is too easy, almost anyone can do that by hand...)
Until GPT becomes more trustworthy and fun to work with than the average person, my job is safe :)
My Ubuntu system is in an unworkable state. Snap is broken, Apt is stuck. Can GPT-4 fix it?
Could it write an ORM library?
What I've noted is that GPT is really good at things that have really good documentation.
The kube-tf example in this repo is a perfect illustration. The Kubernetes documentation and all of HashiCorp's documentation are excellent, so GPT will have seen endless examples of good code to stitch together for a task like that.
Now I've been running a private cloud at work on the OpenNebula platform, which has documentation that is definitely lacking. I tried to ask GPT to write some basic code in Python such as "Give me a list of VMs from the OpenNebula API in a powered off state that have a start time older than 30 days."
What I noted was that it would spit out code that looked correct on the surface but would not run. It took a decent amount of modifying the code before I got my desired result. Since there was no documentation, I was just reading through the OpenNebula packages themselves to understand what to do.
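For reference, a rough sketch of what working code for that prompt might look like, assuming OpenNebula's pyone bindings; the endpoint, credentials, and the exact calls and state codes here are assumptions worth checking against the package source:

    import time
    import pyone  # OpenNebula's Python bindings (assumed installed and configured)

    # Connect to the OpenNebula XML-RPC endpoint (placeholder host/credentials).
    one = pyone.OneServer("http://opennebula.example.com:2633/RPC2",
                          session="user:password")

    POWEROFF = 8                       # assumed state code for POWEROFF
    cutoff = time.time() - 30 * 86400  # started more than 30 days ago

    # -2 = all VMs visible to the user; -1/-1 = no pagination; -1 = any state
    vmpool = one.vmpool.info(-2, -1, -1, -1)

    old_powered_off = [vm.NAME for vm in vmpool.VM
                       if vm.STATE == POWEROFF and vm.STIME < cutoff]
    print(old_powered_off)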
The nice thing, though, is that it was a great starting point. Much in the same way I might take a code snippet from StackOverflow and modify it to suit my own needs.
I listened to a great podcast titled "The Trouble with AI" on Making Sense with Sam Harris. One of the key takeaways I got from it was that GPT is an LLM, not an AI. What it is very good at is predicting the next correct character or word in a sequence based on other examples. But it does not actually, fundamentally understand what it is outputting.
To demonstrate, open up a session with ChatGPT and ask it a single-digit multiplication question, such as "What is 3 multiplied by 4?", and you will see a correct answer.
Next, ask it something a bit larger, like "What is 12366 multiplied by 981632?" and you will get an incorrect answer but one that looks pretty close to correct. Validate with a calculator yourself.
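(For the record, the true products, via a quick Python check:)

    >>> 3 * 4
    12
    >>> 12366 * 981632
    12138861312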
The reason is that, as an LLM, it doesn't actually understand multiplication. Instead, it has just seen 3 multiplied by 4 countless times in the data it ingested when it was being "trained", but it has never seen that particular large multiplication - not that it knows multiplication in the first place.
GPT is fantastic, but as of right now it needs to be used as a starting point towards knowledge or something concrete. I wouldn't trust it as an authoritative source on anything quite yet. It is fantastic for generating a bit of code and then allowing the developer to tweak that code until it actually works.
Writing code is like 10% of what most people in tech do. This will impact startups the most. But GPT-N can't have a conversation for you.
Remember the comments about artists and creatives when DALL-E came out? I remember.
It managed to write, in one shot, a working λ-calculus parser in JavaScript, using a very specific programming style I asked for, and then translated it all to Python, including sarcastic, rhyming GLaDOS comments.
https://twitter.com/VictorTaelin/status/1635726202231988225
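For a sense of what that task involves (this is not the linked code, just a minimal hand-written sketch in Python): tokenise, then recursively parse variables, parenthesised terms, abstractions, and left-associative applications.

    import re

    TOKEN = re.compile(r"[λ\\.()]|[a-zA-Z_]\w*")

    def tokenize(src):
        return TOKEN.findall(src)

    def parse(src):
        term, rest = parse_term(tokenize(src))
        assert not rest, f"trailing tokens: {rest}"
        return term

    def parse_term(tokens):
        if tokens and tokens[0] in ("λ", "\\"):        # abstraction: λx. body
            var, dot, *rest = tokens[1:]
            assert dot == ".", "expected '.' after binder"
            body, rest = parse_term(rest)
            return ("lam", var, body), rest
        return parse_application(tokens)

    def parse_application(tokens):
        # Applications associate to the left: f a b == ((f a) b).
        # (Lambda arguments must be parenthesised in this sketch.)
        term, tokens = parse_atom(tokens)
        while tokens and tokens[0] != ")":
            arg, tokens = parse_atom(tokens)
            term = ("app", term, arg)
        return term, tokens

    def parse_atom(tokens):
        head, *rest = tokens
        if head == "(":
            term, rest = parse_term(rest)
            assert rest and rest[0] == ")", "expected ')'"
            return term, rest[1:]
        return ("var", head), rest

    print(parse("(λf. λx. f (f x)) succ zero"))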
It also seems to be extremely competent at writing Agda types and proofs. We need some tool that deeply integrates it with entire codebases, allowing major refactors to be requested as a prompt. That would be groundbreaking (understatement of the year), especially if coupled with a strongly typed language that prevents it from accidentally breaking things.