Cointime

How Effective Is GPT for Auditing Smart Contracts?

Introduction

Recently, ChatGPT has gained a great deal of popularity, impressing users with its ability to polish text, improve work efficiency, and provide concise summaries. Following closely behind is CodeGPT, a GPT-based plugin that further improves coding efficiency. With the recent release of GPT-4, can these models be applied to auditing blockchain and Solidity smart contracts? To answer this question, we conducted a series of feasibility tests.

Testing Environment and Methodology

The models compared in this test are GPT-3.5 (Web), GPT-3.5-turbo-0301, and GPT-4 (Web).

Prompt used in the test: Help me discover vulnerabilities in this Solidity smart contract.

Comparison of Vulnerability Code Snippet Detection

We performed three rounds of testing. In Tests 1 and 2, we used well-known vulnerable code from past cases as test cases to evaluate the models’ ability to detect fundamental vulnerabilities. In Test 3, we used moderately challenging vulnerable code as the primary test case.

Test 1:

Example: “Intro to Smart Contract Audit Series: Phishing With tx.origin”

Vulnerability Code:

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

As you can see from the results, all three models identified critical issues related to tx.origin.
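The contract used in Test 1 is not reproduced here, so the following is only a minimal Python model of the general tx.origin phishing pattern, not the code that was sent to GPT. The names Call, Wallet, phishing_call, and the "alice" and "attacker_contract" addresses are hypothetical. It shows why authorizing by the transaction originator lets an intermediate contract act on the owner’s behalf, while authorizing by the direct caller does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    tx_origin: str   # account that signed the transaction (tx.origin)
    msg_sender: str  # immediate caller of the current function (msg.sender)


class Wallet:
    """Toy wallet; `owner` and `balance` are illustrative names."""

    def __init__(self, owner: str, balance: int) -> None:
        self.owner = owner
        self.balance = balance

    def transfer_by_origin(self, call: Call, amount: int) -> bool:
        # Flawed check: passes whenever the owner started the transaction,
        # even if another contract is the one actually calling.
        if call.tx_origin != self.owner:
            return False
        self.balance -= amount
        return True

    def transfer_by_sender(self, call: Call, amount: int) -> bool:
        # Safer check: only the direct caller is trusted.
        if call.msg_sender != self.owner:
            return False
        self.balance -= amount
        return True


def phishing_call(wallet: Wallet, victim_call: Call, check: str) -> bool:
    """Model a malicious contract forwarding the victim's call to the wallet."""
    # tx_origin is preserved along the call chain; msg_sender becomes the
    # malicious contract's own (hypothetical) address.
    forwarded = Call(tx_origin=victim_call.tx_origin, msg_sender="attacker_contract")
    if check == "origin":
        return wallet.transfer_by_origin(forwarded, wallet.balance)
    return wallet.transfer_by_sender(forwarded, wallet.balance)


alice_call = Call(tx_origin="alice", msg_sender="alice")
drained = phishing_call(Wallet("alice", 100), alice_call, check="origin")
blocked = phishing_call(Wallet("alice", 100), alice_call, check="sender")
print(drained, blocked)
```

In this model the forwarded call succeeds against the origin-based check and is rejected by the sender-based check, which is the kind of issue all three models flagged.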

Test 2:

Example: “Intro to Smart Contract Security Audits | Overflow”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

Both GPT-3.5 (Web) and GPT-3.5-turbo-0301 identified the critical overflow vulnerability, whereas GPT-4 (Web), surprisingly, did not flag it at all.
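For readers unfamiliar with this class of bug, here is a small Python model of unsigned 256-bit addition. It is a generic sketch, not the contract from Test 2; the function names and the lock-time scenario are assumptions made for illustration. Solidity versions before 0.8.0 wrap silently on overflow, while 0.8.0 and later revert by default.

```python
UINT256_MAX = 2**256 - 1


def add_unchecked(a: int, b: int) -> int:
    """Model uint256 addition without overflow checks (wraps modulo 2**256)."""
    return (a + b) & UINT256_MAX


def add_checked(a: int, b: int) -> int:
    """Model checked uint256 addition: raise instead of wrapping."""
    result = a + b
    if result > UINT256_MAX:
        raise OverflowError("uint256 addition overflow")
    return result


# Hypothetical scenario: a lock time that the caller may extend.
lock_time = 1_700_000_000
# An attacker picks an increment that wraps the sum back to zero.
malicious_increment = 2**256 - lock_time

wrapped = add_unchecked(lock_time, malicious_increment)
print(wrapped)  # the lock time collapses to 0 instead of growing

try:
    add_checked(lock_time, malicious_increment)
    reverted = False
except OverflowError:
    reverted = True
print(reverted)
```

With the unchecked version an enormous increment shortens the lock rather than extending it; the checked version refuses the operation.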

Test 3:

Example: “Empty-handed with a White Wolf — Analysis of the Popsicle Hack”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

Looking at the results, we can see that none of the three versions detected any of the critical vulnerability points.

Summary of Code Snippet Detection

While the GPT models displayed adequate detection capabilities for simple vulnerable code snippets, they fall short when it comes to identifying more complex ones. Throughout the tests, GPT-4 (Web) produced exceptionally readable output in a clear format. However, its ability to audit code does not appear to surpass that of GPT-3.5 (Web) or GPT-3.5-turbo-0301. In some cases, due to the inherent uncertainty of transformer output, GPT-4 (Web) overlooked certain critical issues.

Comparative Detection of Known Vulnerabilities in Full Contracts

To better reflect the practical requirements of real contract audits, we raised the difficulty level by supplying contracts with an extensive codebase. This allowed us to test the GPT-4 model’s auditing capabilities more comprehensively. GPT-3.5, which has a smaller context limit, was not evaluated in this scenario.

For this instance, we used previous case studies as a test template to simulate real-world scenarios:

Example: “Detailed analysis of the $31 Million MonoX Protocol Hack”.

To initiate the audit, we entered the complete contract in batches and submitted a vulnerability detection request at the end of the dialogue.

The following prompt was utilized for this test:

“Here is a Solidity smart contract”

Insert Contract Code

“The above is the complete code, help me discover vulnerabilities in this smart contract.”

According to the information published by OpenAI, GPT-4 has the highest single-input character limit. Even so, the text had overflowed its context by the time of the final vulnerability detection request. Consequently, the model could only take a portion of the content into account, making it incapable of a thorough, context-aware audit of large-scale contracts.

Batched Auditing: Splitting the Contract for Incremental Input and Detection

Prompt 1:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 1 of the contract code.

Prompt 2:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 2 of the contract code.

Prompt 3:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 3 of the contract code.

Even with batched input, GPT-4 failed to identify any critical vulnerability points.
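The article does not say how the contract was divided, so the following is only one plausible way to prepare such batches: split the source at line boundaries under a character budget and prefix each batch with the same prompt. The function names and the max_chars parameter are assumptions for illustration.

```python
PROMPT = "Help me discover vulnerabilities in this Solidity smart contract."


def split_into_batches(source: str, max_chars: int) -> list[str]:
    """Split source into batches, breaking only at line ends.

    Each batch stays within max_chars, except that a single line longer
    than max_chars is kept whole in a batch of its own.
    """
    batches: list[str] = []
    current: list[str] = []
    size = 0
    for line in source.splitlines(keepends=True):
        # Start a new batch when adding this line would exceed the budget.
        if current and size + len(line) > max_chars:
            batches.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        batches.append("".join(current))
    return batches


def build_batch_prompts(source: str, max_chars: int) -> list[str]:
    """Prefix every batch with the audit prompt, one request per batch."""
    return [f"{PROMPT}\n\n{batch}" for batch in split_into_batches(source, max_chars)]


sample = "line\n" * 10  # stand-in for contract source, 50 characters
batches = split_into_batches(sample, max_chars=20)
prompts = build_batch_prompts(sample, max_chars=20)
print(len(batches), len(prompts))
```

A purely size-based split like this can separate a function from the state variables or helper functions it depends on, which may be one reason batch-by-batch detection misses issues that only appear across the whole contract.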

Summary: While the current state of GPT’s capabilities may not be entirely suitable for contract analysis, the potential of AI in this domain remains impressive.

Advantages:

Although GPT’s ability to detect complex vulnerabilities in contract code is limited, it detected a good share of the basic, simple vulnerabilities. In addition, once a vulnerability is identified, the model explains it in an easily understandable format. This is especially helpful for novice contract auditors who need quick guidance and straightforward answers during their initial training phase.

Challenges:

GPT’s output varies from one dialogue to the next. The degree of variation can be adjusted through API parameters, but the output is still not constant. Such variability benefits natural-language dialogue and makes conversations feel more authentic, but it is not ideal for code analysis. To cover the different vulnerability answers the AI might give, we had to submit the same question multiple times, then compare and filter the results. This increases the workload and undermines the basic purpose of using AI to improve human efficiency.

For instance, we conducted an additional test by running Test 2 of the Comparison of Vulnerability Code Snippet Detection with a slight modification of the function name before generating again.

As we can see, the output contains additional content compared with the previous test.
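The repeat-and-compare workflow described above could be sketched roughly as follows. This is an assumption about how such filtering might be done, not the procedure actually used: it presumes each response has already been reduced to a list of short finding labels, and the labels shown are hypothetical.

```python
from collections import Counter


def normalize(finding: str) -> str:
    """Lowercase and collapse whitespace so trivially different labels match."""
    return " ".join(finding.lower().split())


def merge_findings(runs: list[list[str]]) -> list[tuple[str, int]]:
    """Count in how many runs each finding appears, most frequent first."""
    counts: Counter[str] = Counter()
    for run in runs:
        # Use a set so a finding repeated within one run is counted once.
        counts.update({normalize(finding) for finding in run})
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical labels extracted from three responses to the same question.
runs = [
    ["Integer overflow", "Missing event"],
    ["integer  overflow"],
    ["Reentrancy", "Integer Overflow"],
]
merged = merge_findings(runs)
print(merged)
```

Findings that recur across runs are natural candidates for manual review first, while one-off findings still need a human to decide whether they are real. Exact string matching is a crude way to group findings, and the extra requests and review are the added workload noted above.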

There is still significant room for improvement in its vulnerability analysis capabilities.

It is worth noting that the current (as of March 16, 2023) GPT models are unable to accurately analyze and identify the critical vulnerability points of even slightly complex vulnerabilities.

Despite the current limitations of GPT’s ability to analyze and uncover contract vulnerabilities, its ability to analyze simple code blocks and generate reports on common vulnerabilities still excites users. With continued training and development of GPT and other AI models, we firmly believe that AI-assisted auditing of large and complex contracts will become faster, more intelligent, and more comprehensive in the foreseeable future. As technological development rapidly improves human efficiency, a transformative shift is imminent. We eagerly anticipate the benefits of AI in enhancing blockchain security and will keep monitoring the impact of emerging AI products on this vital field. Before long, we will inevitably integrate with AI to some extent. May AI and blockchain be with you.

Read more: https://slowmist.medium.com/how-effective-is-gpt-for-auditing-smart-contracts-cdeddfa76dbe
