Cointime

Auditing with ChatGPT: Complementary But Incomplete

In November 2022, OpenAI launched ChatGPT, an innovative Artificial Intelligence (AI) project. In addition to summarizing articles, crafting essays, and even writing jokes and poems, ChatGPT can be used to debug and generate code. With more than $3.7 billion lost to hacks and scams of Web3 projects, some people wondered if this new technology could improve insecure smart contract code.

ZKasino, a decentralized betting platform, recently engaged in a pre-audit with ChatGPT. ZKasino hoped that ChatGPT could give it an initial security review while CertiK’s comprehensive audit was still in progress. The team wanted to test the capabilities of ChatGPT as a smart contract auditor. So how did it perform? Is AI ready to take over from expert manual code auditors, or does the human touch still have something to offer?

On Dec. 23, 2022, ZKasino “hired” ChatGPT to identify potential security issues in their smart contracts. The tool raised several concerns that sounded valid on the surface.

While ChatGPT undeniably provides a valuable service to the Web3 security community, we found considerable room for improvement: it missed a number of important vulnerabilities and flagged false positives in sound code.

We hope that our insight and recommendations can help ChatGPT become an even stronger tool for securing Web3 applications. The following sections present our findings on these two types of mistakes.

What Did ChatGPT Find?

What Did ChatGPT Miss?

ChatGPT mentioned several common security concerns that can be found in many smart contract implementations. However, it failed to identify certain serious security issues, including:

  • Project-specific logic vulnerabilities
  • Inaccurate math calculations and statistical models
  • Inconsistencies between implementation and design intention

Vulnerability #1: Project-Specific Logic

ChatGPT failed to identify a critical vulnerability that left ZKasino exposed to an exploit in which attackers could consistently win and drain funds from the Bankroll contract. Players join a game by requesting randomness from Chainlink's Verifiable Random Function (VRF), and Chainlink's VRF then triggers the fulfillRandomWords() function with random numbers to complete the game. ZKasino’s code allowed a refund of the user's wager to be triggered if the call to fulfillRandomWords() fails.

Figure 1: A consistent winning attack strategy

During CertiK’s code review of the same smart contract code, a potentially harmful _transferPayout() invocation was discovered. The function was designed to transfer winning payouts to the player's account. An attacker can deliberately make _transferPayout() revert whenever they lose, causing the entire fulfillRandomWords() call to fail. After a waiting period of 100 blocks, the attacker can then invoke CoinFlip_Refund() to reclaim the wager, meaning the attacker would never lose money.
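The refund loop described above can be modeled in a few lines. The following is a minimal Python sketch, not ZKasino's Solidity code. The names Reverted, honest_receiver, attacker_receiver, and net_result are hypothetical, the 2x payout is an assumed coin-flip multiplier, and we assume the attacker plays through a contract that reverts whenever it is sent a zero payout.

```python
class Reverted(Exception):
    """Stands in for an EVM revert (hypothetical name)."""


def honest_receiver(amount: int) -> None:
    # A normal player accepts any payout, including zero.
    pass


def attacker_receiver(amount: int) -> None:
    # Assumption: the attacker's contract refuses a losing (zero) payout.
    if amount == 0:
        raise Reverted("refusing a losing payout")


def fulfill_random_words(wager: int, player_won: bool, receiver) -> int:
    payout = 2 * wager if player_won else 0  # assumed 2x coin-flip payout
    receiver(payout)  # models _transferPayout(); a revert aborts the callback
    return payout


def net_result(wager: int, player_won: bool, receiver) -> int:
    """Net balance change for the player after one bet."""
    try:
        payout = fulfill_random_words(wager, player_won, receiver)
    except Reverted:
        # The callback failed, so after the waiting period the refund
        # path returns the full wager.
        payout = wager
    return payout - wager
```

In this model an honest player who loses ends up down the wager, while the attacker's losing bets net to zero and winning bets still pay out, which is why the strategy can only gain over time.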

While the transfer failure issue was recognized by ChatGPT, the potential attack methods linked to the project design were not. Thus, the impact of the failure combined with the project's logic was not identified by ChatGPT. See ZKasino’s full audit report for a description of the specific attack flow.

Vulnerability #2: Inaccurate Math Calculations and Statistical Models

Ensuring that randomness and game outcomes meet reasonable expectations is of the utmost importance in any gaming project. To confirm this, the randomness of each game outcome was thoroughly evaluated during the audit process. Though ChatGPT acknowledges the significance of this matter, it does not detect any cases of unfairness. ChatGPT brings up the use of VRF and the potential for unfair outcomes if the VRF contract is compromised or manipulated:

“If the VRF contract is not secure or is manipulated, it could potentially lead to unfair outcomes for the game.”

However, this conclusion is limited and does not address the root causes of unfairness. We found a number of potential issues regarding randomness in the course of our audit.

Unfair Randomness Distribution

One medium-severity issue concerns the unfair use of random numbers in the VideoPoker game, where players have a lower chance of drawing certain cards.
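This article does not describe the root cause of the VideoPoker issue. As a general illustration only, one common way a card draw becomes uneven is modulo bias: reducing a random value with the modulo operator when the size of the random range is not a multiple of the deck size. The Python sketch below is an assumption-laden example of that general pitfall, not a reconstruction of ZKasino's code; card_counts and the 256-value range are hypothetical.

```python
from collections import Counter


def card_counts(random_range: int, deck_size: int = 52) -> Counter:
    """Count how many raw random values map to each card index."""
    return Counter(value % deck_size for value in range(random_range))


# Hypothetical case: one random byte (256 values) reduced to 52 cards.
biased = card_counts(256)

# Hypothetical case: a range that is an exact multiple of the deck size.
uniform = card_counts(52 * 4)
```

With 256 raw values, card indices 0 through 47 are each hit by five values while indices 48 through 51 are hit by only four, so the last four cards are drawn less often. With a range of 208 values, every card is hit exactly four times.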

Decimal Truncation

Another issue was discovered in the Dice game, where decimal truncation would have allowed players to choose special multipliers that maximize their expected returns.
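To show how truncation can tilt expected returns, here is a hedged Python sketch. The formula, the constants, and the function names are all hypothetical and are not taken from ZKasino's Dice contract. We assume a roll that is uniform over 10,000 outcomes, a multiplier stored as a fixed-point integer (2.0x is 20,000), and an intended 99% return to the player. Python's // on non-negative integers truncates the same way unsigned integer division does in Solidity.

```python
from fractions import Fraction

OUTCOMES = 10_000          # assumed: roll is uniform in [0, OUTCOMES)
SCALE = 10_000             # assumed fixed-point scale: 2.0x -> 20_000
TARGET_RETURN_BPS = 9_900  # assumed intended return of 99%


def losing_outcomes(multiplier_scaled: int) -> int:
    # The ideal value is OUTCOMES * (1 - 0.99 / multiplier); integer
    # division rounds it down, which rounds the winning count up.
    return OUTCOMES * (multiplier_scaled - TARGET_RETURN_BPS) // multiplier_scaled


def expected_return(multiplier_scaled: int) -> Fraction:
    """Exact expected return per unit wagered for a given multiplier."""
    wins = OUTCOMES - losing_outcomes(multiplier_scaled)
    return Fraction(wins, OUTCOMES) * Fraction(multiplier_scaled, SCALE)
```

Under these assumptions a 2.0x multiplier returns exactly 99/100, but a 101x multiplier returns 9999/10000, because the fractional part of the losing count is discarded. A player who searches for multipliers with large discarded fractions gets a better deal than the design intends.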

Vulnerability #3: Inconsistencies Between Implementation and Intended Design

ChatGPT is often able to understand the implementation of a single function while failing to grasp the design's underlying purpose. For example, it may understand the technical execution of a certain function but be unable to place the purpose of this function in the broader context of the smart contract. To ensure that ChatGPT does not make mistakes in its coding, it needs to better understand smart contract code logic. As it currently stands, ChatGPT provides a surface-level reading of the code. To take its auditing to the next level, it must be able to work backwards from a function to derive its initial logic: a significant task.

Incorrect Input Validation

An input validation issue was discovered in the Plinko contract that could result in incorrect multiplier settings.

According to ZKasino, the number of rows used in Plinko should be 8 to 16. However, the Bankroll contract owner can set a row number value outside the expected range through the function setPlinkoMultipliers() because of a bug in the function's validation check.

The code indicates the transaction will revert only if both numRows and risk are invalid. However, if only one of the two criteria is invalid, the check will still pass, and the code will not revert.
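The original snippet is not reproduced in this copy of the article. Based on the description above, the flaw amounts to joining the two invalid-input conditions with a logical AND where an OR was needed. The Python sketch below models that logic. The function names are hypothetical, and the bound risk < 3 is taken from ChatGPT's description of the function rather than from the contract source.

```python
def buggy_should_revert(num_rows: int, risk: int) -> bool:
    # Reverts only when BOTH inputs are invalid.
    rows_invalid = num_rows < 8 or num_rows > 16
    risk_invalid = risk >= 3
    return rows_invalid and risk_invalid


def fixed_should_revert(num_rows: int, risk: int) -> bool:
    # Reverts when EITHER input is invalid.
    rows_invalid = num_rows < 8 or num_rows > 16
    risk_invalid = risk >= 3
    return rows_invalid or risk_invalid
```

With the buggy version, a call such as numRows = 20 and risk = 0 is accepted even though 20 rows is out of range. The fixed version rejects it.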

ChatGPT gave a different answer in response to the second inquiry: “The function then checks if the value of ‘numRows’ is between 8 and 16, and if the value of ‘risk’ is less than 3. If either of these conditions are not met, the function reverts with the error ‘InvalidNumberToSet’.”

ChatGPT appears to comprehend the purpose of the function. Nevertheless, it describes the intended behavior rather than what the code actually does, and it cannot identify the real vulnerability without extra information.

Inconsistent Value Update

In the Slots contract, an issue related to an inconsistent update to totalValue was identified, which could result in the game ending prematurely. The totalValue variable was used to track the user's winnings or losses, but it recorded only the payout and failed to deduct the wager, leading to an incorrect calculation of the user's gain or loss.
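A small Python sketch can show why omitting the wager matters. This is an illustration under assumptions, not the Slots contract's logic: we assume a multi-spin session that stops once the running total reaches a stop-gain threshold, and the names spins_played, stop_gain, and deduct_wager are hypothetical.

```python
def spins_played(spins, stop_gain: int, deduct_wager: bool) -> int:
    """Return how many spins run before the assumed stop-gain check ends the session.

    `spins` is a list of (wager, payout) pairs.
    """
    total_value = 0
    played = 0
    for wager, payout in spins:
        played += 1
        if deduct_wager:
            total_value += payout - wager  # true net gain or loss
        else:
            total_value += payout          # payout only, as in the reported issue
        if total_value >= stop_gain:
            break
    return played


# Four spins, each wagering 10 and paying out 12 (a net gain of 2 per spin).
session = [(10, 12), (10, 12), (10, 12), (10, 12)]
```

With a stop-gain of 20, the payout-only total reaches 24 after two spins and the session ends early, even though the player's real net gain at that point is only 4. When the wager is deducted, all four spins are played.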

Conclusion

Despite its training, ChatGPT misses certain important security issues in its audits. This is due to the limitations of AI in fully understanding the complexities and nuances of code, as well as its lack of hands-on experience in real-world scenarios. As stated on its official website, ChatGPT is a research release that relies on natural language processing for dialogue purposes. It is often unable to understand the intent and reasoning behind the code as well as a human auditor can. As such, it is important to supplement ChatGPT's analysis with manual audits by experienced security experts to ensure accuracy.

The following summary highlights the strengths and weaknesses of human-based services and ChatGPT on various criteria.

The effectiveness of ChatGPT's answers is largely dependent on the format of the prompt. In this blog, we compare the pre-audit results of our customer's interactions with ChatGPT and the final audit results performed by experts at CertiK. As technology improves and a clearer understanding of prompt engineering arises, engineers will be able to make better use of ChatGPT. Keep a lookout for our future blog posts, in which we delve into the art and science of prompt engineering: posing effective questions to ChatGPT.

Read more: https://www.certik.com/resources/blog/6oBs1st22AsSYxpF7ENoiX-auditing-with-chatgpt-complementary-but-incomplete
