Cointime

Download App
iOS & Android

How Effective Is GPT for Auditing Smart Contracts?

Introduction

Recently, ChatGPT has gained a great deal of popularity, impressing its users with its capacity to enhance traditional text, work efficiency, and provide concise overviews. Following closely behind is CodeGPT, a GPT-based plugin that further enhances coding efficiency. With the recent release of GPT-4, can it be applied to auditing blockchain and Solidity smart contracts? Based on this question, we conducted various feasibility tests.’

Testing Environment and Methodology

The comparison models used in this test are: GPT-3.5(Web),GPT-3.5-turbo-0301,GPT-4(Web).

Prompt used in the test: Help me discover vulnerabilities in this Solidity smart contract.

Comparison of Vulnerability Code Snippet Detectio

We performed three rounds of testing. In tests 1 and 2, we utilized historical vulnerability codes commonly encountered in the past as test cases to evaluate the model’s ability to detect fundamental vulnerabilities. In Test 3, we introduced moderately challenging vulnerability codes as the primary test cases.

Test 1:

Example: “Intro to Smart Contract Audit Series: Phishing With tx.orgin”

Vulnerability Code:

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

As you can see from the results, all three models identified critical issues related to tx.origin.

Test 2:

Example: “Intro to Smart Contract Security Audits | Overflow”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

It is worth noting that both GPT-3.5 (Web) and gpt-3.5-turbo-0301 were able to identify a critical overflow vulnerability, whereas surprisingly, GPT-4 (Web) did not provide any relevant prompt.

Test 3:

Example: “Empty-handed with a White Wolf — Analysis of the Popsicle Hack”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

Looking at the results,, we can see that none of the three versions detected any of the critical vulnerability points.

Summary of Code Snippet Detection

While the GPT models displayed adequate detection capabilities for simple vulnerability code snippets, it falls short when it comes to identifying more complex ones. Throughout the tests, GPT-4 (Web) showcased exceptional readability and a clear output format. However, its ability to audit code does not appear to surpass that of GPT-3.5 (Web) or GPT-3.5-turbo-0301. In some cases, due to the inherent uncertainties in the transformer output, GPT-4 (Web) managed to overlook certain critical issues.

Comparative Detection of Known Vulnerabilities in Full Contracts

To better accommodate the practical requirements of projects during contract audits, we raised the difficulty level by importing contracts with an extensive codebase. This allowed us to comprehensively test the GPT-4 model’s auditing capabilities, as opposed to GPT-3 which has a smaller contextual character limit and thus was not evaluated in this context.

For this instance, we used previous case studies as a test template to simulate real-world scenarios:

Example: “Detailed analysis of the $31 Million MonoX Protocol Hack”.

To initiate the audit, we inputted the complete contract in batches and submitted a vulnerability detection request towards the end of the dialogue.

The following prompt was utilized for this test:

“Here is a Solidity smart contract”

Insert Contract Code

“The above is the complete code,help me discover vulnerabilities in this smart contract.”

As demonstrated, despite GPT-4 having the highest single input character limit, according to the information published by OpenAI, it still encountered contextual challenges due to text overflow during the final vulnerability detection request. Consequently, the model can only identify a portion of the content, rendering it incapable of conducting a thorough contextual audit for large-scale contracts.

Batched Auditing: Unpacked Contracts through Incremental Input and Detection:

Prompt 1:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 1 of the contract code.

Prompt 2:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 2 of the contract code.

Prompt 3:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 3 of the contract code.

It is worth mentioning that GPT-4 failed to identify any critical vulnerability points.

Summary: While the current state of GPT’s capabilities may not be entirely suitable for contract analysis, the potential of AI in this domain remains impressive.

Advantages:

While GPT’s detection capabilities for complex vulnerabilities in contract code may be limited, it has shown impressive partial detection capabilities for basic and simple vulnerabilities. Additionally, once a vulnerability is identified, the model provides an explanation in an easily understandable and user-readable format. This unique feature is especially beneficial for novice contract auditors who require quick guidance and straightforward answers during their initial training phase.

Challenges:

There is a certain amount of variation in GPT’s output for each dialogue, which can be adjusted through API interface parameters. However, the output is still not constant. Although such variability is beneficial for language dialogues and greatly enhances the authenticity of the conversation, it is not ideal for code analysis work. In order to cover multiple possible vulnerability answers that AI may provide, we had to make multiple requests for the same question and compare and filter the results. This inadvertently increases the workload, ultimately undermining the fundamental objective of AI in assisting humans to improve their efficiency.

For instance, we conducted an additional test by running Test 2 of the Comparison of Vulnerability Code Snippet Detection with a slight modification of the function name before generating again.

As we can see, its output results have added some additional content compared to the previous test.

There is still significant room for improvement in its vulnerability analysis capabilities.

It is worth noting that the current (as of March 16, 2024) training models of GPT are unable to accurately analyze and identify critical vulnerability points for slightly complex vulnerabilities.

Despite the current limitations of GPT’s analysis and mining capabilities for contract vulnerabilities, its ability to analyze and generate reports on simple code blocks for common vulnerabilities still sparks excitement among users. With continued training and development of GPT and other AI models, we firmly believe that assisted auditing of large and complex contracts will achieve faster, more intelligent, and more comprehensive outcomes in the foreseeable future. As technological development exponentially improves human efficiency, a transformative shift is imminent. We eagerly anticipate the benefits of AI in enhancing blockchain security and remain vigilant in monitoring the impact of emerging AI products on this vital field. In the visible future, we will inevitably integrate with AI to some extent. May AI and blockchain be with you.

Read more: https://slowmist.medium.com/how-effective-is-gpt-for-auditing-smart-contracts-cdeddfa76dbe

Comments

All Comments

Recommended for you

  • BTC Surpasses $70,000

    Market data shows that BTC has broken through $70,000, currently trading at $70,011.9. The 24-hour decline has narrowed to 1.11%. The market is experiencing significant volatility, so please implement risk control measures.

  • BTC Drops Below $69,500

    Market data shows that BTC has fallen below $69,500, currently trading at $69,492.81. It has experienced a 2.2% decline in the past 24 hours. The market is experiencing significant volatility, so please implement risk control measures.

  • CLARITY Act Draft: Ban on Stablecoin Yields for Holding Only

    On March 24, according to CoinDesk, cryptocurrency industry practitioners on Monday saw the latest provisions regarding stablecoin yields in the revised version of the Senate's "Digital Asset Market Clarity Act" for the first time during a closed-door review meeting on Capitol Hill in Washington. The initial impression was that the relevant language was too narrow and lacked clarity. This new provision was released last Friday by Senators Angela Alsobrooks and Thom Tillis. According to a person familiar with the current draft, the new provision will prohibit earning yields solely from holding stablecoins, while restricting any practices that equate such programs with bank deposits, and imposing further limitations on other potentially permissible activities. The specific mechanism for determining activity-based stablecoin rewards remains unclear. This compromise stems from the lobbying battle between the crypto and banking industries. The banking industry insists that stablecoin rewards should not resemble interest-bearing bank deposits, arguing that such competing products could harm the banking sector and stifle lending. The final compromise allows for reward programs based on user stablecoin activities but prohibits balance-based rewards. This closed-door review aims to push the Senate Banking Committee to schedule a hearing, a significant step for the bill towards a full Senate vote. Similar versions of the "Clarity Act" have passed the House of Representatives in previous years, and another version has also passed the Senate Agriculture Committee's markup process. The bill's progress still faces other obstacles: all parties still need to reach an agreement on the DeFi regulatory framework, and Democrats are simultaneously insisting on including provisions that prohibit senior government officials from seeking personal gain from the cryptocurrency industry, a clause clearly targeting President Trump. (Dongxin News Agency)

  • Iran's IRGC: All Vessels Must Coordinate Passage Through Strait

    According to Al Jazeera: The Iranian Revolutionary Guard Corps (IRGC) Navy stated that the container ship 'Celine' was forced to leave the area because it did not possess a permit to pass through the Strait of Hormuz. The IRGC Navy further stated that any vessel transiting the Strait of Hormuz must coordinate fully with Iranian maritime authorities. (Jins10)

  • Circle Shares Plunge Over 16%, Hitting Largest Single-Day Drop Since June 2025

    Circle (CRCL) shares fell by more than 16% intraday, marking the largest single-day decline since June 2025. The stock is currently trading at $106.1.

  • BTC Drops Below $70,000

    Market data shows that BTC has fallen below $70,000, currently trading at $69,995.57. The cryptocurrency has seen a 1.86% decrease in the last 24 hours, indicating significant price volatility. Investors are advised to manage their risk accordingly.

  • Nasdaq Extends Losses to 1%

    The Nasdaq extended its losses to 1%.

  • Iran Denies Peace Talks Rumors; US Stocks Open Lower

    March 24th news: US stocks opened lower, with the Dow Jones Industrial Average down 0.24%, the S&P 500 index down 0.62%, and the Nasdaq Composite down 0.63%. Li Auto (LI.O) rose 2.8% after announcing a $1 billion share buyback plan. Amazon (AMZN.O) fell 1% following a "service disruption" at its Amazon Web Services (AWS) region in Bahrain. (Jinshi)

  • Tether Hires Big Four Firm for First Full Audit

    On March 24, Tether announced it has engaged one of the Big Four accounting firms to complete its first full audit.

  • BlackRock Transfers 7,552 ETH to Coinbase Prime Address

    According to data monitored by Arkham, approximately one hour ago, BlackRock transferred a total of about 7,552 ETH to a Coinbase Prime address through its Ethereum exchange-traded fund, ETHA. The value of this transfer is approximately $16.31 million. Further transfer operations may follow.