This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives. Automated static analysis generally does not account for environmental considerations when reporting out-of-bounds memory operations. This can make it difficult for users to determine which warnings should be investigated first. For example, an analysis tool might report buffer overflows that originate from command line arguments in a program that is not expected to run with setuid or other special privileges.
Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')
This vulnerability occurs when a program copies data from one memory location to another without first verifying that the source data will fit within the destination buffer's allocated space.
What is CWE-120?
Real-world CVEs caused by CWE-120
-
buffer overflow using command with long argument
-
buffer overflow in local program using long environment variable
-
buffer overflow in comment characters, when product increments a counter for a ">" but does not decrement for "<"
-
By replacing a valid cookie value with an extremely long string of characters, an attacker may overflow the application's buffers.
-
By replacing a valid cookie value with an extremely long string of characters, an attacker may overflow the application's buffers.
Step-by-step attacker path
- 1
The following code asks the user to enter their last name and then attempts to store the value entered in the last_name array.
- 2
The problem with the code above is that it does not restrict or limit the size of the name entered by the user. If the user enters "Very_very_long_last_name" which is 24 characters long, then a buffer overflow will occur since the array can only hold 20 characters total.
- 3
The following code attempts to create a local copy of a buffer to perform some manipulations to the data.
- 4
However, the programmer does not ensure that the size of the data pointed to by string will fit in the local buffer and copies the data with the potentially dangerous strcpy() function. This may result in a buffer overflow condition if an attacker can influence the contents of the string parameter.
- 5
The code below calls the gets() function to read in data from the command line.
Vulnerable C
The following code asks the user to enter their last name and then attempts to store the value entered in the last_name array.
char last_name[20];
printf ("Enter your last name: ");
scanf ("%s", last_name); Secure pseudo
// Validate, sanitize, or use a safe API before reaching the sink.
function handleRequest(input) {
const safe = validateAndEscape(input);
return executeWithGuards(safe);
} How to prevent CWE-120
- Requirements Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is theoretically safe.
- Architecture and Design Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. Examples include the Safe C String Library (SafeStr) by Messier and Viega [REF-57], and the Strsafe.h library from Microsoft [REF-56]. These libraries provide safer versions of overflow-prone string-handling functions.
- Operation / Build and Compilation Use automatic buffer overflow detection mechanisms that are offered by certain compilers or compiler extensions. Examples include: the Microsoft Visual Studio /GS flag, Fedora/Red Hat FORTIFY_SOURCE GCC flag, StackGuard, and ProPolice, which provide various mechanisms including canary-based detection and range/index checking. D3-SFCV (Stack Frame Canary Validation) from D3FEND [REF-1334] discusses canary-based detection in detail.
- Implementation Consider adhering to the following rules when allocating and managing an application's memory: - Double check that your buffer is as large as you specify. - When using functions that accept a number of bytes to copy, such as strncpy(), be aware that if the destination buffer size is equal to the source buffer size, it may not NULL-terminate the string. - Check buffer boundaries if accessing the buffer in a loop and make sure there is no danger of writing past the allocated space. - If necessary, truncate all input strings to a reasonable length before passing them to the copy and concatenation functions.
- Implementation Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.
- Architecture and Design For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.
- Operation / Build and Compilation Run or compile the software using features or extensions that randomly arrange the positions of a program's executable and libraries in memory. Because this makes the addresses unpredictable, it can prevent an attacker from reliably jumping to exploitable code. Examples include Address Space Layout Randomization (ASLR) [REF-58] [REF-60] and Position-Independent Executables (PIE) [REF-64]. Imported modules may be similarly realigned if their default memory addresses conflict with other modules, in a process known as "rebasing" (for Windows) and "prelinking" (for Linux) [REF-1332] using randomly generated addresses. ASLR for libraries cannot be used in conjunction with prelink since it would require relocating the libraries at run-time, defeating the whole purpose of prelinking. For more information on these techniques see D3-SAOR (Segment Address Offset Randomization) from D3FEND [REF-1335].
- Operation Use a CPU and operating system that offers Data Execution Protection (using hardware NX or XD bits) or the equivalent techniques that simulate this feature in software, such as PaX [REF-60] [REF-61]. These techniques ensure that any instruction executed is exclusively at a memory address that is part of the code segment. For more information on these techniques see D3-PSEP (Process Segment Execution Prevention) from D3FEND [REF-1336].
How to detect CWE-120
This weakness can be detected using dynamic tools and techniques that interact with the software using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The software's operation may slow down, but it should not become unstable, crash, or generate incorrect results.
Manual analysis can be useful for finding this weakness, but it might not achieve desired code coverage within limited time constraints. This becomes difficult for weaknesses that must be considered for all inputs, since the attack surface can be too large.
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Bytecode Weakness Analysis - including disassembler + source code weakness analysis Binary Weakness Analysis - including disassembler + source code weakness analysis
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Cost effective for partial coverage: ``` Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Cost effective for partial coverage: ``` Web Application Scanner Web Services Scanner Database Scanners
Plexicus auto-detects CWE-120 and opens a fix PR in under 60 seconds.
Codex Remedium scans every commit, identifies this exact weakness, and ships a reviewer-ready pull request with the patch. No tickets. No hand-offs.
Frequently asked questions
What is CWE-120?
This vulnerability occurs when a program copies data from one memory location to another without first verifying that the source data will fit within the destination buffer's allocated space.
How serious is CWE-120?
MITRE rates the likelihood of exploit as High — this weakness is actively exploited in the wild and should be prioritized for remediation.
What languages or platforms are affected by CWE-120?
MITRE lists the following affected platforms: C, C++, Assembly.
How can I prevent CWE-120?
Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is…
How does Plexicus detect and fix CWE-120?
Plexicus's SAST engine matches the data-flow signature for CWE-120 on every commit. When a match is found, our Codex Remedium agent opens a fix PR with the corrected code, tests, and a one-line summary for the reviewer.
Where can I learn more about CWE-120?
MITRE publishes the canonical definition at https://cwe.mitre.org/data/definitions/120.html. You can also reference OWASP and NIST documentation for adjacent guidance.
Weaknesses related to CWE-120
Out-of-bounds Write
This vulnerability occurs when software incorrectly writes data outside the boundaries of its allocated memory buffer, either beyond the…
Stack-based Buffer Overflow
A stack-based buffer overflow occurs when a program writes more data to a buffer located on the call stack than it can hold, corrupting…
Heap-based Buffer Overflow
A heap-based buffer overflow occurs when a program writes more data to a memory buffer allocated in the heap than it can hold, corrupting…
Write-what-where Condition
A write-what-where condition occurs when an attacker can control both the data written and the exact memory location where it's written,…
Buffer Underwrite ('Buffer Underflow')
A buffer underwrite, also known as buffer underflow, happens when a program writes data to a memory location before the official start of…
Use of Path Manipulation Function without Maximum-sized Buffer
This vulnerability occurs when a program uses a path manipulation function but supplies an output buffer that is too small to hold the…
Further reading
- MITRE — official CWE-120 https://cwe.mitre.org/data/definitions/120.html
- Writing Secure Code https://www.microsoftpressstore.com/store/writing-secure-code-9780735617223
- Using the Strsafe.h Functions https://learn.microsoft.com/en-us/windows/win32/menurc/strsafe-ovw?redirectedfrom=MSDN
- Safe C String Library v1.0.3 http://www.gnu-darwin.org/www001/ports-1.5a-CURRENT/devel/safestr/work/safestr-1.0.3/doc/safestr.html
- Address Space Layout Randomization in Windows Vista https://learn.microsoft.com/en-us/archive/blogs/michael_howard/address-space-layout-randomization-in-windows-vista
- Limiting buffer overflows with ExecShield https://archive.is/saAFo
Don't Let Security
Weigh You Down.
Stop choosing between AI velocity and security debt. Plexicus is the only platform that runs Vibe Coding Security and ASPM in parallel — one workflow, every codebase.