OWASP - Invalid character in request (non printable characters)
Introduction
- The rule “Invalid character in request (non-printable characters)”, as found in web application firewall rule sets such as the OWASP Core Rule Set (CRS), flags requests that carry control or other non-printable characters. Such characters rarely occur in legitimate input but are useful to an attacker: a NUL byte can truncate a string in a native parser, CR and LF can split HTTP headers or forge log entries, and other control characters can slip past naive filters. Validating and sanitizing all input data is what prevents these injection attacks.
Steps to Handle Invalid Characters in a Request
Input Validation
:- Validate every user-supplied value against an allowlist of the characters it is expected to contain, and reject anything else.
- Use regular expressions or predefined patterns to express the allowlist, and anchor them to the whole value so a bad character at the end cannot slip through.
Input Sanitization
:- Where a value cannot simply be rejected, sanitize it by removing or escaping non-printable or otherwise harmful characters before it is processed. Rejection is the safer default: silently dropping bytes can change the meaning of a value.
Encoding
:- Decode request bytes with a single, explicit character set, and treat malformed sequences as an error rather than substituting replacement characters.
- UTF-8 is the widely used and recommended encoding for web applications.
Error Handling
:- Catch invalid-character errors and fail closed: reject the request (for HTTP, a 400 response) with a generic message, and log the offending position and code point rather than the raw bytes, so the same control characters cannot corrupt the log.
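The four steps above, carried through in one request handler. This is an added example, not code from the original post; the function names, the field names and the sample inputs are constructed for illustration. It goes a step beyond the printable-ASCII check shown later on this page: it decodes strictly as UTF-8, classifies characters by Unicode category so accented and non-Latin text is accepted while control and format characters are rejected, and separates "reject" from "strip".

```python
import logging
import unicodedata

logger = logging.getLogger("request_validation")

# Unicode general categories that are never printable: control (Cc, which
# covers NUL, TAB, CR, LF, ESC and DEL), format (Cf, e.g. zero-width space and
# BOM), surrogates (Cs), private use (Co), unassigned (Cn) and the line and
# paragraph separators (Zl, Zp). Ordinary spaces (Zs) and all letters, digits,
# punctuation and symbols in any script remain allowed.
NON_PRINTABLE = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp"}


class InvalidCharacterError(ValueError):
    """Raised when a request field fails validation."""


def decode_request_bytes(raw: bytes) -> str:
    """Encoding step: strict UTF-8, so malformed byte sequences are an error."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise InvalidCharacterError(f"malformed UTF-8 at byte {exc.start}") from exc


def find_non_printable(text: str):
    """Return (offset, char) of the first non-printable code point, else None."""
    for offset, ch in enumerate(text):
        if unicodedata.category(ch) in NON_PRINTABLE:
            return offset, ch
    return None


def validate_field(name: str, raw: bytes, max_len: int = 256) -> str:
    """Validation step: decode, bound the length, reject on any bad character."""
    text = decode_request_bytes(raw)
    if len(text) > max_len:
        raise InvalidCharacterError(f"{name}: longer than {max_len} characters")
    hit = find_non_printable(text)
    if hit is not None:
        offset, ch = hit
        # Error-handling step: log the code point as U+XXXX, never the raw
        # character, so a CR/LF in the input cannot forge a new log line.
        logger.warning("field %s rejected: U+%04X at offset %d", name, ord(ch), offset)
        raise InvalidCharacterError(f"{name}: non-printable character at offset {offset}")
    return text


def field_is_valid(raw: bytes) -> bool:
    """Convenience wrapper: True if validate_field accepts the bytes."""
    try:
        validate_field("field", raw)
    except InvalidCharacterError:
        return False
    return True


def sanitize_field(raw: bytes) -> str:
    """Sanitization step, for free-form fields where dropping control
    characters is acceptable. Malformed UTF-8 is still rejected, not repaired."""
    text = decode_request_bytes(raw)
    return "".join(ch for ch in text if unicodedata.category(ch) not in NON_PRINTABLE)


def handle_request(fields: dict) -> tuple:
    """Validate every field of a request and return (status, body) the way a
    web handler would. A failure yields a 400 with a generic message; the
    detail (field, offset, code point) goes only to the log."""
    cleaned = {}
    for name, raw in fields.items():
        try:
            cleaned[name] = validate_field(name, raw)
        except InvalidCharacterError as exc:
            logger.info("rejecting request: %s", exc)
            return 400, {"error": "invalid character in request"}
    return 200, cleaned


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Constructed sample inputs, not taken from the original page.
    samples = {
        "plain": b"Hello World",
        "nul": b"Hello\x00World",           # NUL, category Cc
        "crlf": b"user\r\nX-Injected: 1",   # CR/LF header-splitting attempt
        "utf8": b"caf\xc3\xa9",             # valid UTF-8 e-acute, allowed
        "zero_width": b"a\xe2\x80\x8bb",    # U+200B zero-width space, Cf
        "bad_utf8": b"\xff\xfe",            # not valid UTF-8
    }
    for name, raw in samples.items():
        print(f"{name:11} valid={field_is_valid(raw)}")
    print(handle_request({"user": b"bob", "note": b"x\r\ny"}))
```

The design choice worth noting is the split between validate_field (reject) and sanitize_field (strip): use rejection for anything that carries meaning, such as identifiers, tokens and header values, and reserve stripping for free-form text where losing a stray control character is harmless.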
Example of Input Validation and Sanitization
- Here’s an example in Python that checks an input string against a printable-ASCII allowlist and, if the check fails, strips the offending characters:
import re

# Allowlist: printable ASCII only (space 0x20 through tilde 0x7E). Everything
# else, including NUL, CR, LF, ESC, DEL and every non-ASCII code point, is
# treated as invalid. That also rejects legitimate accented or non-Latin text,
# so use this only for fields that are genuinely ASCII (tokens, identifiers).
PRINTABLE_ASCII = re.compile(r'[\x20-\x7E]*')

def validate_and_sanitize_input(input_str):
    # fullmatch() tests the whole string. A '^...$' pattern checked with
    # match() would accept "Hello\n", because '$' also matches just before
    # a trailing newline.
    if PRINTABLE_ASCII.fullmatch(input_str):
        return input_str
    # Otherwise strip every character outside the printable ASCII range
    return re.sub(r'[^\x20-\x7E]', '', input_str)
Example usage
user_input = "Hello\x00World" # Contains a non-printable character (\x00)
clean_input = validate_and_sanitize_input(user_input)
print(clean_input) # Output: HelloWorld
OWASP Recommendations
- OWASP publishes comprehensive guidance for securing web applications, including how to validate and sanitize input.
OWASP Input Validation Cheat Sheet
:- Gives detailed guidance on validating input, favouring allowlists of accepted characters over denylists of known-bad ones.
OWASP Secure Coding Practices
:- Offers a checklist of secure coding practices that covers input validation and output encoding.
This post is licensed under CC BY 4.0 by the author.