AI's Role in Bug-Free Software

Good Morning! AI systems like Baldur show promise in automating the laborious process of generating mathematical proofs for software verification. A SQL expert managed to implement a full GPT-style language model in roughly 500 lines of code, demonstrating the versatility of SQL. Some argue that scientists need not follow strict software-engineering standards, as their code is often short-term and goal-oriented. And security researchers urge caution about how ChatGPT-powered apps handle personal information.

AI's Role in Bug-Free Software

A team led by Emily First at the University of Massachusetts Amherst has developed a method for automatically generating mathematical proofs that verify software correctness. The technique, known as machine checking, involves writing a proof that the software behaves as specified and then using a theorem prover to confirm that the proof is valid.

Traditionally, writing these proofs has been a laborious task, requiring extensive expertise and often producing proofs longer than the software code itself. To overcome this, First's team started from Minerva, a large language model (LLM) trained on a vast corpus of natural-language text and fine-tuned on mathematical content from scientific papers and webpages. The model was then further fine-tuned on proofs written in Isabelle/HOL, a language used for expressing machine-checkable mathematical proofs.

The team introduced Baldur, an AI system that generates an entire proof in one pass and works with the theorem prover to check its output. When the theorem prover identifies an error, it feeds the failed proof and the error message back into the LLM, enabling it to learn from its mistake and generate a new, hopefully correct, proof.
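The generate-check-repair loop described above can be sketched as follows. This is a minimal illustration, not Baldur's actual code: both `generate_proof` and `check_proof` are hypothetical stand-ins for the real LLM and Isabelle/HOL interfaces.

```python
# Sketch of a Baldur-style proof-repair loop. The LLM and theorem-prover
# calls below are toy placeholders, not the real Baldur/Isabelle APIs.

def generate_proof(theorem, error_feedback=None):
    """Hypothetical LLM call: propose a whole proof in one pass,
    optionally conditioned on the previous attempt's error message."""
    if error_feedback is None:
        return f"by auto  (* first attempt for {theorem} *)"
    return f"by simp  (* retry after: {error_feedback} *)"

def check_proof(theorem, proof):
    """Hypothetical theorem-prover call: returns (ok, error_message)."""
    ok = "simp" in proof  # toy stand-in for a real Isabelle check
    return ok, (None if ok else "tactic 'auto' failed")

def prove(theorem, max_attempts=3):
    """Alternate between proof generation and checking, feeding each
    failure back into the model, until a proof is accepted."""
    error = None
    for _ in range(max_attempts):
        proof = generate_proof(theorem, error)
        ok, error = check_proof(theorem, proof)
        if ok:
            return proof  # the theorem prover accepted this proof
    return None  # give up after max_attempts

print(prove("rev (rev xs) = xs"))
```

The key design choice, per the article, is that the model emits a complete proof rather than one proof step at a time, and the prover's error output becomes part of the next prompt.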

Thor, the previous state-of-the-art tool for automatically generating proofs, succeeds 57% of the time. When Baldur is paired with Thor, the success rate rises to 65.7%. Even with this error rate, the combination is the most effective and efficient method yet devised for verifying software correctness, and its effectiveness should grow as the underlying AI models improve.

Read More Here

GPT in 500 lines of SQL

Alex Bolenok, a well-known SQL expert, has implemented a full GPT-2-style language model in SQL, using roughly 500 lines of code. He shared the achievement on his blog, EXPLAIN EXTENDED, on the last day of 2023.

The GPT-2 model, known for its proficiency in generating human-like text, has been condensed into a single SQL query: the final inference query, as Bolenok demonstrated, is 498 lines long. The feat demonstrates SQL's power to express surprisingly complex computation.

Read More Here

Why bad scientific code beats code following "best practices"

The argument is that scientists, who often view coding as a means to an end, may not benefit from the stringent standards of software engineering. They are primarily interested in solving specific problems, and spending time on "good" coding practices could detract from their research efforts. Moreover, the code written by scientists may not need to be as robust or scalable as commercial software, as it is often used for specific, short-term projects.

However, this perspective has its critics. Poorly written code can lead to errors, slow performance, and crashes, which can directly hamper scientific work. Furthermore, a recent study showed that low-quality code can have significant business impacts, including longer development times and more defects. This suggests that investing in code quality could lead to more efficient and reliable scientific computing.

Read More Here

ChatGPT Apps: Rising Popularity, Personal Info Caution

Artificial Intelligence (AI) has revolutionized the way we work, with AI-powered productivity apps becoming increasingly popular. However, recent investigations have raised concerns about the privacy and security of these tools, particularly those using the ChatGPT API.

Security researchers at Private Internet Access (PIA) found troubling instances of poor transparency in the privacy policies of these apps. One such app, a popular AI chat assistant, uses the ChatGPT API and its existing database to tailor responses to user prompts. While this feature enhances user experience, it also raises questions about how personal data is handled.

In Italy, the data protection authority, Garante, has raised concerns about ChatGPT's compliance with the General Data Protection Regulation (GDPR). The regulator believes that OpenAI, the company behind ChatGPT, lacks adequate age controls and legal basis for using people's personal information.

Read More Here

Was this forwarded to you? Sign Up Here
