dmsc/c/
validation.rs

//! Copyright © 2025-2026 Wenze Wei. All Rights Reserved.
//!
//! This file is part of DMSC.
//! The DMSC project belongs to the Dunimd Team.
//!
//! Licensed under the Apache License, Version 2.0 (the "License");
//! You may not use this file except in compliance with the License.
//! You may obtain a copy of the License at
//!
//!     http://www.apache.org/licenses/LICENSE-2.0
//!
//! Unless required by applicable law or agreed to in writing, software
//! distributed under the License is distributed on an "AS IS" BASIS,
//! WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
//! See the License for the specific language governing permissions and
//! limitations under the License.

//! # Validation Module C API
//!
//! This module provides C language bindings for DMSC's validation and sanitization infrastructure. The
//! validation module delivers comprehensive data validation, sanitization, and transformation capabilities
//! for ensuring data integrity and security across the application. This C API enables C/C++ applications
//! to leverage DMSC's validation functionality for building robust input handling, data transformation,
//! and security enforcement layers.
//!
//! ## Module Architecture
//!
//! The validation module comprises three primary components:
//!
//! - **DMSCValidationResult**: Result container for validation operations. It carries the
//!   success/failure status, error messages, and field-level validation results.
//!
//! - **DMSCValidatorBuilder**: Fluent builder interface for constructing validation rules.
//!   The builder supports chaining multiple validation constraints, custom validation functions, and
//!   conditional validation logic.
//!
//! - **DMSCSanitizer**: Sanitization engine for cleaning, normalizing, and transforming input data.
//!   Sanitizers remove or neutralize potentially harmful content while preserving valid data.
//!
//! ## Validation Types
//!
//! The validation system supports comprehensive data type validation:
//!
//! - **String Validation**: Length constraints, pattern matching, format validation, character set
//!   restrictions, and Unicode normalization. Supports regex patterns, email formats, URLs, UUIDs,
//!   and custom format specifications.
//!
//! - **Numeric Validation**: Range constraints, precision validation, integer/float differentiation,
//!   divisibility rules, and comparison operators. Supports minimum/maximum values, exclusive/inclusive
//!   bounds, and custom comparison logic.
//!
//! - **Boolean Validation**: Truthiness checks, explicit true/false requirements, and boolean string
//!   parsing (true/false, yes/no, 1/0).
//!
//! - **Array/Collection Validation**: Length constraints, element type validation, uniqueness
//!   requirements, duplicate detection, and sorted order verification.
//!
//! - **Object/Structure Validation**: Nested object validation, required field checks, conditional
//!   field requirements, and dependency validation between fields.
//!
//! - **Date/Time Validation**: Format compliance, range constraints, timezone handling, and
//!   temporal relationship validation (before/after, within duration).
//!
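/*!
The boolean string parsing mentioned above can be sketched in a few lines of standalone Rust. This is an
illustration only: `parse_bool_str` is a hypothetical helper, not a DMSC function, and the trimming and
ASCII case-insensitivity are assumptions of this sketch rather than documented DMSC behavior.

```rust
/// Parses the spellings listed above (true/false, yes/no, 1/0).
/// Returns `None` for any other input so the caller can report an error.
/// Trimming and ASCII case-insensitivity are assumptions made for this sketch.
pub fn parse_bool_str(input: &str) -> Option<bool> {
    match input.trim().to_ascii_lowercase().as_str() {
        "true" | "yes" | "1" => Some(true),
        "false" | "no" | "0" => Some(false),
        _ => None,
    }
}
```

Mapping unrecognized input to `None` keeps the decision about how to report the failure with the caller.
*/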
//! ## Validation Rules
//!
//! Built-in validation rules cover common requirements:
//!
//! - **Required Fields**: Non-empty, non-null validation with customizable empty value definitions.
//!   Supports nested required field chains.
//!
//! - **Type Checking**: Compile-time and runtime type verification. Ensures data conforms to expected
//!   types with automatic type coercion where enabled.
//!
//! - **Range Validation**: Minimum and maximum value constraints for numeric and comparable types.
//!   Supports exclusive/inclusive bounds and custom comparison functions.
//!
//! - **Pattern Matching**: Regular expression validation for strings. Supports full match, partial
//!   match, and capture group extraction.
//!
//! - **Format Validation**: Built-in format validators for common patterns including email addresses,
//!   URLs, URIs, IP addresses (IPv4/IPv6), MAC addresses, credit card numbers, phone numbers,
//!   postal codes, and ISO country/currency codes.
//!
//! - **Length Validation**: Minimum and maximum length constraints for strings and collections.
//!   Supports byte length, character length, and grapheme cluster counting.
//!
//! - **Uniqueness Validation**: Ensures values are unique within a collection or against a data
//!   source. Supports database-backed uniqueness checking.
//!
//! - **Comparison Validation**: Cross-field comparisons for equality, inequality, and relative
//!   ordering. For example, checks that a password matches its confirmation or that the start of a
//!   date range precedes its end.
//!
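/*!
The difference between byte length and character length in the Length Validation rule is easy to get
wrong, so here is a minimal standalone sketch. `LengthUnit` and `check_length` are hypothetical names used
for illustration, not part of the DMSC API. Grapheme cluster counting is left out because it needs an
external crate.

```rust
/// Hypothetical unit selector for a length check (illustration only).
#[derive(Clone, Copy)]
pub enum LengthUnit {
    /// Count UTF-8 bytes.
    Bytes,
    /// Count Unicode scalar values.
    Chars,
}

/// Returns true when the measured length of `s` lies in `min..=max`.
pub fn check_length(s: &str, unit: LengthUnit, min: usize, max: usize) -> bool {
    let len = match unit {
        LengthUnit::Bytes => s.len(),
        LengthUnit::Chars => s.chars().count(),
    };
    (min..=max).contains(&len)
}
```

With the input `"h\u{e9}llo"` the character count is 5 while the byte count is 6, so a maximum of 5
passes in `Chars` mode and fails in `Bytes` mode.
*/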
//! ## Custom Validators
//!
//! The validation system supports custom validation logic:
//!
//! - **Custom Predicate Functions**: User-defined validation functions that take an input value and
//!   return a validation result. Enables domain-specific validation rules.
//!
//! - **Callback Validators**: External validation function pointers for integrating with existing
//!   validation libraries or business logic.
//!
//! - **Composition Validators**: Combine multiple validators using AND/OR/NOT logical operators.
//!   Supports complex validation rule composition.
//!
//! - **Contextual Validators**: Validators that use additional context information for validation.
//!   Enables validation that depends on system state or other data.
//!
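/*!
The AND/OR/NOT composition described above can be sketched with boxed closures. This is a standalone
illustration, not DMSC's implementation: the `Validator` alias and the `and`, `or`, `not`, `min_len`, and
`all_ascii_alnum` helpers are hypothetical names introduced here.

```rust
/// A validator is modeled here as a boxed predicate over string input.
pub type Validator = Box<dyn Fn(&str) -> bool>;

/// Passes only when both validators pass; `b` is skipped if `a` fails.
pub fn and(a: Validator, b: Validator) -> Validator {
    Box::new(move |s| a(s) && b(s))
}

/// Passes when either validator passes; `b` is skipped if `a` passes.
pub fn or(a: Validator, b: Validator) -> Validator {
    Box::new(move |s| a(s) || b(s))
}

/// Inverts the outcome of a validator.
pub fn not(a: Validator) -> Validator {
    Box::new(move |s| !a(s))
}

/// Leaf validator: at least `n` characters.
pub fn min_len(n: usize) -> Validator {
    Box::new(move |s| s.chars().count() >= n)
}

/// Leaf validator: only ASCII letters and digits.
pub fn all_ascii_alnum() -> Validator {
    Box::new(|s| s.chars().all(|c| c.is_ascii_alphanumeric()))
}
```

A username rule could then be written as `and(min_len(3), all_ascii_alnum())`. The `&&` and `||`
operators give short-circuit evaluation without extra code.
*/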
//! ## Sanitization Features
//!
//! The sanitization engine provides comprehensive data cleaning:
//!
//! - **HTML Sanitization**: Remove or escape HTML tags while preserving safe content. Configurable
//!   whitelist of allowed tags and attributes. Prevents XSS attacks in web contexts.
//!
//! - **SQL Injection Prevention**: Escape special characters in SQL queries. Supports parameterized
//!   query generation. Prevents SQL injection attacks.
//!
//! - **Command Injection Prevention**: Sanitize input used in system commands. Remove dangerous
//!   characters and escape shell metacharacters.
//!
//! - **XML Sanitization**: Validate and clean XML input. Remove dangerous entities and processing
//!   instructions. Prevents XXE (XML External Entity) attacks.
//!
//! - **JSON Sanitization**: Validate JSON structure and escape special characters. Remove potentially
//!   dangerous content while preserving valid JSON.
//!
//! - **Unicode Normalization**: Normalize Unicode strings to standard forms (NFC, NFD, NFKC, NFKD).
//!   Prevents encoding-based attacks and ensures consistent string representation.
//!
//! - **Whitespace Handling**: Trim leading/trailing whitespace, collapse multiple spaces, and
//!   normalize line endings. Configurable normalization rules.
//!
//! - **Character Filtering**: Remove or replace specific characters or character classes. Supports
//!   Unicode character categories and custom character sets.
//!
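/*!
Two of the simpler sanitization steps above, whitespace handling and HTML escaping, can be sketched with
the standard library alone. `collapse_whitespace` and `escape_html` are hypothetical helpers for
illustration, not DMSC functions. The sketch shows the escaping variant only and does not attempt the
tag and attribute whitelist.

```rust
/// Trims leading/trailing whitespace and collapses inner runs to one space.
pub fn collapse_whitespace(input: &str) -> String {
    input.split_whitespace().collect::<Vec<&str>>().join(" ")
}

/// Escapes the five characters that are significant in HTML text and
/// attribute values, leaving everything else untouched.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            other => out.push(other),
        }
    }
    out
}
```

Escaping turns markup into inert text, whereas a whitelist-based sanitizer keeps selected tags, so the two
approaches suit different output contexts.
*/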
//! ## Transformation Capabilities
//!
//! Built-in transformations modify data during validation:
//!
//! - **Type Coercion**: Automatically convert between compatible types, such as string to number,
//!   boolean string parsing, and date parsing from multiple formats.
//!
//! - **Case Conversion**: Transform string case (lowercase, uppercase, title case, sentence case).
//!   Supports locale-aware case conversion.
//!
//! - **Truncation**: Limit string length with configurable behavior (cut at boundary, word boundary,
//!   sentence boundary).
//!
//! - **Default Values**: Provide default values when input is missing or invalid. Supports conditional
//!   default assignment based on other fields.
//!
//! - **Value Mapping**: Map input values to output values through lookup tables or functions.
//!   Supports enum-like conversions and code normalization.
//!
//! - **Array Transformations**: Flatten nested arrays, filter empty elements, deduplicate, and
//!   sort collections.
//!
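/*!
Word-boundary truncation, one of the transformations listed above, can be sketched as follows.
`truncate_at_word` is a hypothetical helper, not a DMSC function. The limit is counted in Unicode scalar
values, which is an assumption of this sketch.

```rust
/// Returns a prefix of `s` with at most `max_chars` characters, backing up to
/// the previous whitespace when the cut would otherwise split a word.
/// A single word longer than the limit is cut hard at the limit.
pub fn truncate_at_word(s: &str, max_chars: usize) -> &str {
    if s.chars().count() <= max_chars {
        return s;
    }
    // Byte offset of the first character past the limit.
    let cut = s
        .char_indices()
        .nth(max_chars)
        .map(|(i, _)| i)
        .unwrap_or(s.len());
    let head = &s[..cut];
    // If the next character is whitespace, the cut already sits on a boundary.
    let on_boundary = s[cut..].chars().next().map_or(true, char::is_whitespace);
    if on_boundary {
        return head.trim_end();
    }
    match head.rfind(char::is_whitespace) {
        Some(i) => head[..i].trim_end(),
        None => head,
    }
}
```

Slicing at offsets taken from `char_indices` keeps every cut on a UTF-8 character boundary, so the
function cannot panic on multi-byte input.
*/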
//! ## Error Handling
//!
//! Comprehensive error handling provides detailed feedback:
//!
//! - **Error Codes**: Numeric error codes categorize validation failures for programmatic handling.
//!   Standard codes for common validation errors.
//!
//! - **Error Messages**: Human-readable error messages in configurable languages. Supports message
//!   templates with variable interpolation.
//!
//! - **Field Attribution**: Errors are attributed to specific fields in nested structures.
//!   Provides complete path to invalid field.
//!
//! - **Error Details**: Additional context about validation failures including the rule that failed,
//!   the invalid value, and expected constraints.
//!
//! - **Bail Behavior**: Option to stop validation at first error or collect all errors. Different
//!   strategies for different use cases.
//!
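/*!
The bail behavior described above comes down to one branch in the rule loop. The sketch below is a
standalone illustration: `FieldError`, `Rule`, `run_rules`, the two sample rules, and the error codes 1001
and 1002 are all hypothetical and do not reflect DMSC's actual types or code values.

```rust
/// Hypothetical error record: field path, numeric code, readable message.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldError {
    pub field: &'static str,
    pub code: i32,
    pub message: &'static str,
}

/// A rule returns `None` on success or `Some(error)` on failure.
pub type Rule = fn(&str) -> Option<FieldError>;

/// Runs the rules in order. With `bail` set, stops after the first failure;
/// otherwise collects every failure.
pub fn run_rules(value: &str, rules: &[Rule], bail: bool) -> Vec<FieldError> {
    let mut errors = Vec::new();
    for rule in rules {
        if let Some(err) = rule(value) {
            errors.push(err);
            if bail {
                break;
            }
        }
    }
    errors
}

/// Sample rule: at least eight characters.
pub fn min_eight_chars(value: &str) -> Option<FieldError> {
    if value.chars().count() >= 8 {
        None
    } else {
        Some(FieldError { field: "password", code: 1001, message: "must be at least 8 characters" })
    }
}

/// Sample rule: at least one ASCII digit.
pub fn has_digit(value: &str) -> Option<FieldError> {
    if value.chars().any(|c| c.is_ascii_digit()) {
        None
    } else {
        Some(FieldError { field: "password", code: 1002, message: "must contain a number" })
    }
}
```

Collecting all errors suits form feedback, where the user should see every problem at once. Bailing early
suits hot paths that only need a pass/fail answer.
*/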
//! ## Performance Characteristics
//!
//! Validation operations are optimized for various scenarios:
//!
//! - **Simple Validation**: O(1) to O(n) depending on constraint type
//! - **Regex Validation**: O(n) where n is string length, optimized with automaton compilation
//! - **Complex Composition**: O(total constraints) with short-circuit evaluation
//! - **Custom Validators**: Performance depends on validator implementation
//! - **Sanitization**: O(n) linear in input size with configurable passes
//!
//! ## Memory Management
//!
//! All C API objects use opaque pointers with manual memory management:
//!
//! - Constructor functions allocate new instances on the heap
//! - Each instance must be released with its matching destructor function
//!   (for example, `dmsc_validation_result_free`)
//! - Validation results contain allocated error messages
//! - Validator builders manage internal rule state
//!
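/*!
The ownership contract above can be sketched with a standalone Rust example. `Handle` and the
`sketch_handle_*` functions are hypothetical stand-ins, not DMSC symbols, and the destructor shown is only
an assumption about what a macro such as `c_destructor!` might generate. A real export would also carry a
`no_mangle` attribute, omitted here to keep the sketch edition-independent.

```rust
/// Stand-in for a wrapped Rust value (hypothetical, not a DMSC type).
pub struct Handle {
    pub valid: bool,
}

/// Constructor: moves the value to the heap and hands ownership to the caller.
pub extern "C" fn sketch_handle_new() -> *mut Handle {
    Box::into_raw(Box::new(Handle { valid: true }))
}

/// Accessor: tolerates NULL by reporting `false`.
pub extern "C" fn sketch_handle_is_valid(ptr: *const Handle) -> bool {
    // SAFETY: the caller passes either NULL or a live pointer from the constructor.
    unsafe { ptr.as_ref() }.map_or(false, |h| h.valid)
}

/// Destructor: reclaims the box so the value is dropped. NULL is ignored.
pub extern "C" fn sketch_handle_free(ptr: *mut Handle) {
    if !ptr.is_null() {
        // SAFETY: the pointer came from `Box::into_raw` and is freed only once.
        unsafe { drop(Box::from_raw(ptr)) };
    }
}
```

Pairing every `Box::into_raw` with exactly one `Box::from_raw` is what makes the manual scheme sound.
Freeing twice or using a pointer after the destructor runs is undefined behavior.
*/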
//! ## Thread Safety
//!
//! The underlying implementations have specific thread safety guarantees:
//!
//! - Validator builders are NOT thread-safe (mutable state during construction)
//! - Compiled validators are immutable and thread-safe
//! - Validation results are read-only after creation
//! - Sanitizers are immutable after configuration
//!
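/*!
Because a compiled validator is immutable, it can be shared across threads without locking. The sketch
below illustrates that idea with the standard library. `CompiledValidator` and `validate_in_parallel` are
hypothetical names, and the single `min_len` rule is a placeholder for real compiled state.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical immutable validator with one rule (illustration only).
pub struct CompiledValidator {
    min_len: usize,
}

impl CompiledValidator {
    pub fn new(min_len: usize) -> Self {
        CompiledValidator { min_len }
    }

    /// Read-only check, so `&self` is enough and no lock is required.
    pub fn validate(&self, input: &str) -> bool {
        input.chars().count() >= self.min_len
    }
}

/// Validates each input on its own thread, sharing one validator through `Arc`.
/// Results are returned in the same order as the inputs.
pub fn validate_in_parallel(validator: Arc<CompiledValidator>, inputs: Vec<String>) -> Vec<bool> {
    let workers: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let shared = Arc::clone(&validator);
            thread::spawn(move || shared.validate(&input))
        })
        .collect();
    workers
        .into_iter()
        .map(|w| w.join().expect("worker thread panicked"))
        .collect()
}
```

A builder, by contrast, mutates its rule list during construction, so it would need exclusive access or a
lock before being touched from more than one thread.
*/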
//! ## Usage Example
//!
//! ```c
//! // Create validation result for checking
//! DMSCValidationResult* result = dmsc_validation_result_valid();
//! if (result == NULL) {
//!     fprintf(stderr, "Failed to create validation result\n");
//!     return ERROR_INIT;
//! }
//!
//! // Create validator builder
//! DMSCValidatorBuilder* builder = dmsc_validator_builder_new();
//! if (builder == NULL) {
//!     fprintf(stderr, "Failed to create validator builder\n");
//!     dmsc_validation_result_free(result);
//!     return ERROR_INIT;
//! }
//!
//! // Configure validation rules for a user struct
//! dmsc_validator_builder_required(builder, "username");
//! dmsc_validator_builder_required(builder, "email");
//! dmsc_validator_builder_required(builder, "password");
//!
//! // String validation: username
//! dmsc_validator_builder_string(builder, "username")
//!     .min_length(builder, 3, "Username must be at least 3 characters")
//!     .max_length(builder, 50, "Username must be at most 50 characters")
//!     .pattern(builder, "^[a-zA-Z0-9_]+$", "Username can only contain alphanumeric characters and underscores")
//!     .alphanumeric(builder, "Username must be alphanumeric");
//!
//! // Email validation with format checking
//! dmsc_validator_builder_email(builder, "email", true)
//!     .normalize(builder, true);
//!
//! // Password validation with complexity requirements
//! dmsc_validator_builder_string(builder, "password")
//!     .min_length(builder, 8, "Password must be at least 8 characters")
//!     .regex(builder, ".*[A-Z].*", "Password must contain an uppercase letter")
//!     .regex(builder, ".*[a-z].*", "Password must contain a lowercase letter")
//!     .regex(builder, ".*[0-9].*", "Password must contain a number")
//!     .regex(builder, ".*[!@#$%^&*].*", "Password must contain a special character");
//!
//! // Numeric validation: age
//! dmsc_validator_builder_number(builder, "age")
//!     .min(builder, 18, "User must be at least 18 years old")
//!     .max(builder, 120, "Age must be realistic")
//!     .integer(builder, true);
//!
//! // Array validation: roles
//! dmsc_validator_builder_array(builder, "roles")
//!     .min_length(builder, 1, "User must have at least one role")
//!     .max_length(builder, 10, "User cannot have more than 10 roles")
//!     .element_string(builder)
//!         .in_list(builder, (char*[]){"admin", "user", "guest"}, 3, "Invalid role");
//!
//! // Conditional validation: admin email requires corporate domain
//! dmsc_validator_builder_when(builder, "role", "admin")
//!     .required(builder, "email")
//!     .custom(builder, admin_email_validator, "Admin email must use corporate domain");
//!
//! // Build the validator
//! DMSCValidator* validator = dmsc_validator_builder_build(builder);
//! if (validator == NULL) {
//!     fprintf(stderr, "Failed to build validator\n");
//!     dmsc_validator_builder_free(builder);
//!     dmsc_validation_result_free(result);
//!     return ERROR_INIT;
//! }
//!
//! // Example input data
//! const char* input_data =
//!     "{\"username\": \"john_doe\", \"email\": \"john@example.com\", "
//!     "\"password\": \"SecurePass123!\", \"age\": 25, \"roles\": [\"user\"]}";
//!
//! // Validate the input
//! int is_valid = dmsc_validator_validate(validator, input_data, strlen(input_data), result);
//!
//! if (is_valid) {
//!     printf("Validation passed!\n");
//!
//!     // Get sanitized output
//!     const char* sanitized = dmsc_validation_result_get_sanitized(result);
//!     if (sanitized != NULL) {
//!         printf("Sanitized: %s\n", sanitized);
//!     }
//! } else {
//!     printf("Validation failed:\n");
//!
//!     // Get error count
//!     int error_count = dmsc_validation_result_get_error_count(result);
//!     printf("Number of errors: %d\n", error_count);
//!
//!     // Iterate through errors
//!     for (int i = 0; i < error_count; i++) {
//!         const char* field = dmsc_validation_result_get_error_field(result, i);
//!         const char* message = dmsc_validation_result_get_error_message(result, i);
//!         int code = dmsc_validation_result_get_error_code(result, i);
//!
//!         printf("  - Field '%s': %s (code: %d)\n", field, message, code);
//!     }
//!
//!     // Check for specific error
//!     if (dmsc_validation_result_has_error_code(result, ERROR_PASSWORD_WEAK)) {
//!         printf("Password strength validation failed\n");
//!     }
//! }
//!
//! // Sanitize input separately
//! DMSCSanitizer* sanitizer = dmsc_sanitizer_new();
//! if (sanitizer == NULL) {
//!     fprintf(stderr, "Failed to create sanitizer\n");
//!     dmsc_validator_free(validator);
//!     dmsc_validator_builder_free(builder);
//!     dmsc_validation_result_free(result);
//!     return ERROR_INIT;
//! }
//!
//! // Configure sanitization
//! dmsc_sanitizer_trim(sanitizer, true);
//! dmsc_sanitizer_collapse_whitespace(sanitizer, true);
//! dmsc_sanitizer_remove_control_chars(sanitizer, true);
//! dmsc_sanitizer_normalize_unicode(sanitizer, NFC);
//!
//! // Apply sanitization
//! const char* dirty_input = "  Hello   World\t\n";
//! char* clean_output = NULL;
//!
//! int sanitize_result = dmsc_sanitizer_sanitize(sanitizer, dirty_input, strlen(dirty_input), &clean_output);
//!
//! if (sanitize_result == 0 && clean_output != NULL) {
//!     printf("Sanitized: '%s'\n", clean_output);
//!     dmsc_string_free(clean_output);
//! }
//!
//! // HTML sanitization for web content
//! dmsc_sanitizer_html_allowed_tags(sanitizer, (char*[]){"p", "br", "b", "i", "a"}, 5);
//! dmsc_sanitizer_html_allowed_attributes(sanitizer, (char*[]){"href", "title"}, 2);
//!
//! const char* html_input = "<p>Hello <script>alert('xss')</script></p>";
//! clean_output = NULL;
//!
//! sanitize_result = dmsc_sanitizer_sanitize_html(sanitizer, html_input, strlen(html_input), &clean_output);
//!
//! if (sanitize_result == 0 && clean_output != NULL) {
//!     printf("HTML Sanitized: %s\n", clean_output);  // Output: <p>Hello </p>
//!     dmsc_string_free(clean_output);
//! }
//!
//! // Cleanup
//! dmsc_sanitizer_free(sanitizer);
//! dmsc_validator_free(validator);
//! dmsc_validator_builder_free(builder);
//! dmsc_validation_result_free(result);
//!
//! printf("Validation example complete\n");
//! ```
//!
//! ## Validator Builder Methods
//!
//! The validator builder provides a fluent interface:
//!
//! ```c
//! // Type-specific builders
//! dmsc_validator_builder_string(builder, field_name)
//!     .min_length(builder, min, message)
//!     .max_length(builder, max, message)
//!     .pattern(builder, regex, message)
//!     .email(builder, strict)
//!     .url(builder)
//!     .uuid(builder)
//!     .alphanumeric(builder, message)
//!     .alpha(builder, message)
//!     .numeric(builder, message)
//!     .lowercase(builder)
//!     .uppercase(builder)
//!     .trim(builder)
//!     .normalize(builder, form);
//!
//! dmsc_validator_builder_number(builder, field_name)
//!     .min(builder, value, message)
//!     .max(builder, value, message)
//!     .positive(builder, message)
//!     .negative(builder, message)
//!     .range(builder, min, max, message)
//!     .integer(builder, strict)
//!     .precision(builder, max_decimals);
//!
//! dmsc_validator_builder_boolean(builder, field_name)
//!     .truthy(builder, true_values, count)
//!     .falsy(builder, false_values, count);
//!
//! dmsc_validator_builder_array(builder, field_name)
//!     .min_length(builder, min, message)
//!     .max_length(builder, max, message)
//!     .unique(builder, message)
//!     .sorted(builder, ascending)
//!     .element_type(builder, element_validator);
//!
//! dmsc_validator_builder_object(builder, field_name)
//!     .required(builder, nested_field)
//!     .optional(builder, nested_field)
//!     .nested(builder, nested_validator);
//! ```
//!
//! ## Dependencies
//!
//! This module depends on the following DMSC components and external resources:
//!
//! - `crate::validation`: Rust validation module implementation
//! - `crate::prelude`: Common types and traits
//! - regex for pattern matching
//! - Unicode normalization (unicode-normalization crate)
//! - HTML5 spec for HTML sanitization
//!
//! ## Feature Flags
//!
//! The validation module is enabled by default.
//! Disable this feature to reduce binary size when validation is not required.
//!
//! Additional features:
//!
//! - `validation-html`: Enable HTML sanitization
//! - `validation-email`: Enable email format validation with DNS checks
//! - `validation-phone`: Enable phone number validation
//! - `validation-i18n`: Enable internationalization support

use crate::validation::{DMSCSanitizer, DMSCValidationResult, DMSCValidatorBuilder};

c_wrapper!(CDMSCValidationResult, DMSCValidationResult);
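c_wrapper!(CDMSCValidatorBuilder, DMSCValidatorBuilder);
c_wrapper!(CDMSCSanitizer, DMSCSanitizer);

// DMSCValidationResult constructors and destructors

/// Creates a `DMSCValidationResult` in the valid state and returns it as a heap-allocated
/// opaque pointer. Ownership passes to the caller, which is expected to release the pointer
/// with `dmsc_validation_result_free`.
#[no_mangle]
pub extern "C" fn dmsc_validation_result_valid() -> *mut CDMSCValidationResult {
    let result = DMSCValidationResult::valid();
    Box::into_raw(Box::new(CDMSCValidationResult::new(result)))
}
c_destructor!(dmsc_validation_result_free, CDMSCValidationResult);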
446    Box::into_raw(Box::new(CDMSCValidationResult::new(result)))
447}
448c_destructor!(dmsc_validation_result_free, CDMSCValidationResult);