In conclusion, Google selected 178 relatively easy issues from their database of 80K bugs and found that Gemini 1.5 was fairly good at handling machine-detected bugs.
Maybe it's time to build a post-UT automated patch generation CI pipeline?
I also think the other ongoing experiment mentioned in the paper is more interesting:
``` investigating the ability of an agent to generate bug-reproducing tests ```