Compare commits

...

15 Commits

Author SHA1 Message Date
zzy
d2eaf2247f build: rename the license file
Rename the file from LISENCE to LICENSE to fix the misspelling.
2026-02-23 16:44:31 +08:00
zzy
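51869bf081 feat(pproc): improve the macro processor to support nested parentheses and GNU extensions
- Track parenthesis depth so macro arguments that contain parentheses are split correctly
- Add support for the GNU extension in which the `##` operator deletes the preceding comma
- Add the helper functions `got_left_non_blank` and `got_right_non_blank`
  to simplify finding non-blank tokens
- Improve error messages to show the expected type alongside the type actually received

fix(pproc): fix error messages for conditional compilation and include file paths

- Improve the error message format in `scc_pproc_parse_if_condition`
- Fix the log format string in the `switch_file_stack` function

test(pproc): add unit tests for macro handling

- Add test cases for the concatenation operator, nested macros, parenthesis handling, and more
- Add the C99 standard examples and tests for GNU variadic-macro comma deletion
- Include tests for complex macro expansion scenarios

chore(justfile): update the build script with a debug target

- Add `debug-scc`, a debug variant of the `test-scc` target
- Update the build command to support dev mode

feat(cbuild): add dry-run mode and improve compiler arguments

- Add a dry-run feature to the compiler class that prints commands without executing them
- Improve include path handling for the scc compiler
- Add a dry-run option to the command-line parser

refactor(log): rename static_assert to StaticAssert to avoid conflicts

- Rename the custom static_assert to StaticAssert to avoid clashing with the standard library

style(scc_core): remove unused predefined macros

- Remove the base-type prefix macro definitions that are no longer needed

fix(scc_core): initialize an uninitialized variable in the ring test

- Give the character variable in the test function an initial value to avoid undefined behavior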
2026-02-21 23:53:44 +08:00
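
The parenthesis-depth tracking mentioned in 51869bf081 can be sketched roughly as follows. This is a minimal character-based illustration, not the scc implementation (which works on tokens); `arg_span_t`, `split_macro_args`, and `arg_equals` are hypothetical names, and string literals that contain commas or parentheses are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one argument inside the original text. */
typedef struct {
    const char *start;
    size_t len;
} arg_span_t;

/* Split the text found between a macro call's outer parentheses.
 * A comma separates arguments only at nesting depth 0.
 * Returns the number of arguments; at most max_args spans are stored. */
static size_t split_macro_args(const char *text, arg_span_t *out,
                               size_t max_args) {
    assert(text != NULL && out != NULL);
    size_t count = 0;
    int depth = 0;
    const char *begin = text;
    for (const char *p = text;; ++p) {
        if (*p == '(') {
            depth++;
        } else if (*p == ')' && depth > 0) {
            depth--;
        } else if ((*p == ',' && depth == 0) || *p == '\0') {
            if (count < max_args) {
                out[count].start = begin;
                out[count].len = (size_t)(p - begin);
            }
            count++;
            if (*p == '\0')
                break;
            begin = p + 1;
        }
    }
    return count;
}

/* Compare a span with a NUL-terminated string. */
static int arg_equals(const arg_span_t *arg, const char *expected) {
    return arg->len == strlen(expected) &&
           memcmp(arg->start, expected, arg->len) == 0;
}
```

Under these assumptions, splitting `f(a, b), c` gives two arguments, `f(a, b)` and ` c`, because the inner comma sits at depth 1.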
zzy
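3b2f68111e chore(license): add the MIT license file
Add the MIT license to the project to state the terms and conditions for using the software,
including the copyright notice, the permission notice, and the warranty disclaimer.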
2026-02-21 16:46:38 +08:00
zzy
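4940b652eb feat(cbuild): add SCC compiler support and update the build script
Add a new SccCompiler class to support the SCC compiler, covering compilation and linking.
Update the build tasks in the justfile, adding new tasks such as clean, build, build-install, and test-scc.
Also fix an exception-handling syntax error and remove the wc.py tool file.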
2026-02-21 16:02:53 +08:00
zzy
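8007825800 feat(pproc): implement preprocessor conditional compilation and variadic macro support
- Add full conditional compilation, including the #if, #elif, #else, and #endif directives
- Implement parsing and evaluation of numeric constant expressions
- Support nested conditionals and mixing them with other directives
- Implement variadic macro definitions and __VA_ARGS__ substitution
- Improve macro expansion to handle variadic macros correctly
- Refactor the preprocessor directive handling logic for better maintainability
- Add unit tests that cover the new features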
2026-02-21 15:59:31 +08:00
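
As an illustration of the kind of input these features have to handle, here is a small piece of standard C that combines `#if`/`#elif`/`#else` with a variadic macro. It is not taken from the scc tests; `LOG_LEVEL`, `LEVEL_NAME`, `FORMAT`, and `format_level` are made-up names for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative input only: plain standard C, not scc test code. */
#define LOG_LEVEL 2

/* Conditional compilation: exactly one branch survives preprocessing. */
#if LOG_LEVEL >= 3
#define LEVEL_NAME "verbose"
#elif LOG_LEVEL == 2
#define LEVEL_NAME "normal"
#else
#define LEVEL_NAME "quiet"
#endif

/* Variadic macro: __VA_ARGS__ stands for every argument matched by `...`. */
#define FORMAT(buf, cap, ...) snprintf((buf), (cap), __VA_ARGS__)

/* Writes "<level>:<value>" into out and returns the formatted length. */
static int format_level(char *out, size_t cap, int value) {
    assert(out != NULL && cap > 0);
    return FORMAT(out, cap, "%s:%d", LEVEL_NAME, value);
}
```

With `LOG_LEVEL` set to 2, only the `#elif` branch is kept, so `format_level(buf, sizeof buf, 7)` writes `normal:7`.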
zzy
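b705e5d0ad feat(argparse): add list-type argument support
Add the scc_argparse_list_t type for arguments that take multiple string values,
along with the matching setup function scc_argparse_spec_setup_list.

fix(lexer): adjust keyword token handling

Change the keyword subtype from SCC_TOK_SUBTYPE_KEYWORD to
SCC_TOK_SUBTYPE_IDENTIFIER and remove the related enum definition.

refactor(lexer): improve the skip-to-newline implementation

Rework scc_lexer_skip_until_newline, improving its loop logic and error handling.

feat(pproc): extend preprocessor conditional compilation support

Refactor conditional-compilation state management, add support for the #ifdef, #ifndef, #elifdef,
#else, and #endif directives, and implement nested conditional handling.

refactor(pproc): improve include path handling

Add a maximum include depth limit, improve how include paths are added,
and fix the naming of the file-state structure.

docs(log): update the log module's standard library dependencies

Adjust the stdarg.h include logic and update the names of the compiler built-in macro definitions.

feat(core): extend the core type definitions

Add aliases for the basic data types to round out type system support.

feat(main): implement the command-line include path argument

Add the -I/--include argument so users can specify additional header search paths.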
2026-02-21 10:46:49 +08:00
zzy
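9c2b4db22a feat(pproc): change the include parsing function to pass position information
Change the signature of scc_pproc_parse_include, adding a token parameter that carries position information,
so that include directives can report more precise error locations. Also update the file-switching logic
to pass the position information to the error log, which makes debugging easier.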
2026-02-20 14:28:11 +08:00
zzy
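bc0b1d23e3 feat(pproc): add preprocessor include path support and improve header lookup
Add a new type, scc_pproc_cstr_vec_t, for storing include paths,
and add an include_paths field to the scc_pproc structure. Implement an improved
switch_file_stack function that looks for headers in the current directory, the parent directory,
and the system include paths, giving more complete handling of the #include directive.

fix(core): rename the ring buffer inline macros to avoid name clashes

Rename the scc_ring_phys macro to _scc_ring_phys and add other related
internal macros such as _scc_ring_cap and _scc_ring_head, to avoid name clashes
with the external interface and make the code clearer.

refactor(main): add the command-line include path option and clean up standard library references

Add the -I/--include option to command-line argument parsing so users can specify
additional header search paths. Also remove the unnecessary stdio.h reference and clean up
some debugging-related buffer settings.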
2026-02-19 19:30:00 +08:00
zzy
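a52ff33e30 feat(ast): update the AST literal representation
Update the AST definition to use the lexeme string instead of a constant value,
and change the AST dump to display literal contents correctly.

BREAKING CHANGE: the literal member of the AST expression struct changes from the value field to the lexme field.

refactor(pproc): refactor macro expansion and file inclusion logic

Split macro expansion out into a standalone interface, implement file inclusion handling,
and improve the preprocessor's state management.

fix(sstream): fix the error code returned by file stream initialization

Correct the error code returned when opening a file fails, so callers can handle the failure properly.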
2026-02-19 15:56:05 +08:00
zzy
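27a87d17ab feat(lexer): improve the preprocessor token test cases and fix ## handling
- Correct the "##" token from SCC_TOK_SHARP to SCC_TOK_SHARP_SHARP
- Add more preprocessor directive test cases, including macro definitions and the error and warning directives
- Correct ## handling in the sequence tests

fix(pproc): complete preprocessor directive handling

- Implement the handling of the #error and #warning directives
- Output error and warning messages given as string literals
- Improve error handling for unhandled directives

fix(pproc): fix a boundary condition in lexer stream handling

- Check for token fetch failure in scc_pproc.c
- Prevent an unhandled edge case at the end of the stream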
2026-02-19 12:14:56 +08:00
zzy
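08a60e6e8a feat: add stringification and concatenation support for preprocessor macro definitions
- Implement the # and ## preprocessor operators
- Add token deep-copy and move functions to support macro expansion
- Change the preprocessor expansion logic to substitute macro parameters correctly
- Handle whitespace when splitting macro arguments

fix: fix memory management and logic errors in preprocessor macro expansion

- Correct how the macro expansion set data structure is initialized
- Fix the parenthesis-matching check for function-like macro calls
- Improve whitespace handling during macro argument parsing
- Resolve token ownership issues during macro expansion

chore: add a file-count command to the justfile and tune the build configuration

- Add the count-file command for counting code files
- Adjust the default naming rule for output files
- Improve field resetting when the lexer releases a token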
2026-02-19 11:20:01 +08:00
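
For reference, the standard behaviour of the `#` and `##` operators that this commit implements can be demonstrated with a few lines of ordinary C. The macro and variable names below (`STR`, `XSTR`, `CAT`, `VERSION`, `foobar`, `check_operators`) are chosen for this sketch only and do not come from the scc sources.

```c
#include <assert.h>
#include <string.h>

#define STR(x) #x      /* stringify the argument exactly as written */
#define XSTR(x) STR(x) /* expand the argument first, then stringify */
#define CAT(a, b) a##b /* paste two tokens into a single token */
#define VERSION 42

static int foobar = 7;

/* Asserts the three behaviours the C standard specifies. */
static void check_operators(void) {
    /* An operand of # is not macro-expanded before stringification. */
    assert(strcmp(STR(VERSION), "VERSION") == 0);
    /* The extra level of indirection lets VERSION expand to 42 first. */
    assert(strcmp(XSTR(VERSION), "42") == 0);
    /* foo##bar forms the identifier foobar. */
    assert(CAT(foo, bar) == 7);
}
```

The `STR`/`XSTR` pair is the usual way to show why argument expansion has to be skipped for operands of `#` and `##` but performed everywhere else.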
zzy
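c86071416d feat(pproc): implement macro expansion and refactor the macro definition interface
- Add the pproc_expand.h header, which defines the data structures and function interfaces for macro expansion
- Rename macro-related types and functions, unifying the scc_pp_* prefix as scc_pproc_*
- Change macro parameter parsing to allow more flexible parameter handling
- Implement full macro expansion for both object-like and function-like macros
- Add support for the stringification operator (#)
- Improve the preprocessor main loop and streamline the macro expansion flow
- Update the unit tests, adding tests for macro parameter parsing and stringification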
2026-02-18 18:18:57 +08:00
zzy
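9d85dc130d feat(lexer): add lexer support for the ## operator
- Rename lexer_token.h to scc_lexer_token.h for naming consistency
- Recognize and handle the ## operator in the lexer
- Change header include paths and the location of type definitions
- Fix the ordering of the token struct definition

fix(lexer): initialize the cur variable in the lexer to avoid an uninitialized read

- Initialize the scc_sstream_char_t cur variable in scc_lexer_get_token

refactor(core): extend the ring buffer and add a cstring comparison function

- Add a null check in scc_core_ring.h to prevent a crash when the fill function is null
- Add the scc_ring_by_buffer macro for creating a ring instance from a buffer
- Add scc_cstring_cmp to scc_core_str.h for string comparison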
2026-02-18 18:17:52 +08:00
zzy
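2de5ae59f5 feat(pproc): implement the C preprocessor and restructure project dependencies
- Add the preprocessor library (pproc), replacing the old pprocessor module
- Implement full macro definition parsing for object-like and function-like macros
- Add conditional compilation directive handling (#if, #ifdef, #ifndef, #else, #elif, #endif)
- Implement macro expansion, including handling of nested and recursive macros
- Add macro definition test cases covering basic functionality and complex scenarios
- Update the dependency configuration in cbuild.toml, removing unfinished modules such as parser, ast, ast2ir, and ir
- Add lexer utility functions for token stream handling
- Add macro definition table management with create, lookup, and delete operations
- Implement macro parameter parsing and replacement list handling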
2026-02-17 22:47:25 +08:00
zzy
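681a15cb44 feat(lexer): add preprocessor keyword support and improve the lexer
Add a complete table of C preprocessor keywords, including define, include, ifdef, and others,
to support the preprocessor.

- Add the SCC_PPKEYWORD_TABLE macro, which defines all preprocessor keywords
- Include the preprocessor keywords in the token type enum
- Refactor the lexer to recognize preprocessor keywords correctly
- Add scc_lexer_tok_drop for releasing token resources

refactor(lexer): refactor the lexer internals

- Rename the keywords array field from tok to tok_type
- Change scc_lexer_get_valid_token to use a while loop instead of do-while
- Change the return type of fill_token and fill_valid_token to cbool
- Clarify the parameter semantics of lexer_to_ring

fix(sstream): correct the return type of the ring buffer fill function

- Change the fill_func return type from int to cbool for consistency
- Update the SCC_RING macro documentation to explain the meaning of the fill callback's return value

docs(argparse): rename the examples directory to fix a path error

- Rename libs/argparse/example to libs/argparse/examples for consistency

test(lexer): update the test cases for the new stream interface

- Change scc_sstream_ref_ring to scc_sstream_to_ring in the test code
- Ensure the test cases stay compatible with the new API

style(lexer): update the example program's log level and implementation

- Change debug logs to info logs
- Use the ring buffer to fetch tokens in the example program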
2026-02-16 22:27:09 +08:00
35 changed files with 2829 additions and 293 deletions

8
LICENSE Normal file

@@ -0,0 +1,8 @@
The MIT License (MIT)
Copyright © 2026 <copyright holders>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


@@ -4,10 +4,10 @@ version = "0.1.0"
 dependencies = [
     { name = "argparse", path = "./libs/argparse" },
-    { name = "pprocessor", path = "./libs/pprocessor" },
     { name = "lexer", path = "./libs/lexer" },
-    { name = "parser", path = "./libs/parser" },
-    { name = "ast", path = "./libs/ast" },
-    { name = "ast2ir", path = "./libs/ast2ir" },
-    { name = "ir", path = "./libs/ir" },
+    { name = "pproc", path = "./libs/pproc" },
+    # { name = "parser", path = "./libs/parser" },
+    # { name = "ast", path = "./libs/ast" },
+    # { name = "ast2ir", path = "./libs/ast2ir" },
+    # { name = "ir", path = "./libs/ir" },
 ]


@@ -11,5 +11,29 @@ count:
     # you need download `tokei` it can download by cargo
     tokei libs runtime src -e tests
-build_lexer:
-    python ./tools/cbuild/cbuild.py --path libs/lexer build
+count-file:
+    # you need download `tokei` it can download by cargo
+    tokei libs runtime src -e tests --files
+clean:
+    cbuild clean
+build:
+    cbuild build -cclang
+build-install: build
+    cp ./build/dev/scc.exe ./scc.exe
+test-scc:
+    # windows: (Get-Content a.txt -Raw) -replace '\x1b\[[0-9;]*[a-zA-Z]', '' | Set-Content clean.txt
+    # linux: sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' a.txt > clean.txt
+    just clean
+    just build-install
+    just clean
+    cbuild build -cscc
+debug-scc:
+    just clean
+    just build-install
+    just clean
+    cbuild build -cscc --dev --record --dry-run


@@ -43,6 +43,7 @@ typedef enum scc_argparse_err {
     SCC_ARGPARSE_ERR_UNKNOWN_VALUE,
 } scc_argparse_err_t;
+typedef SCC_VEC(const char *) scc_argparse_list_t;
 // Constraint specification struct
 struct scc_argparse_spec {
     scc_argparse_val_type_t value_type; // value type
@@ -56,6 +57,7 @@ struct scc_argparse_spec {
         const char **str_store; // string storage
         char **str_alloc_store; // string storage allocated with alloc; must be freed
         void **ptr_store; // generic pointer storage
+        scc_argparse_list_t *vec_store; // new: pointer to a vector of strings
     } store;
     // Enum value constraints
@@ -215,6 +217,15 @@ static inline void scc_argparse_spec_setup_choices(scc_argparse_spec_t *spec,
     spec->choices.count = count;
 }
+// Inline helper for setting up a list
+static inline void scc_argparse_spec_setup_list(scc_argparse_spec_t *spec,
+                                                scc_argparse_list_t *vec) {
+    spec->value_type = SCC_ARGPARSE_VAL_TYPE_LIST;
+    spec->store.vec_store = vec;
+    // Automatically set flag_takes_multiple to true, because a list needs multiple values
+    spec->flag_takes_multiple = true;
+}
 #define SCC_ARGPARSE_MACRO_SETTER(attr) \
     static inline void scc_argparse_spec_set_##attr(scc_argparse_spec_t *spec, \
                                                     cbool flag) { \


@@ -63,6 +63,10 @@ static inline void parse_cmd(scc_optparse_t *optparse,
         min_args = 0;
         max_args = 0;
         break;
+    case SCC_ARGPARSE_VAL_TYPE_LIST: // list type
+        min_args = 1;
+        max_args = 65535; // FIXME maybe INT_MAX ?
+        break;
     default:
         min_args = 0;
         max_args = 0;
@@ -161,6 +165,10 @@ static int handle_option(scc_argparse_context_t *ctx, scc_argparse_t *parser) {
         return SCC_ARGPARSE_ERR_PNT_DEFAULT;
     }
+    if (parser->need_version && scc_strcmp(opt->long_name, "version") == 0) {
+        // TODO default version print
+    }
     if (opt->spec.flag_store_as_count) {
         (*opt->spec.store.int_store)++;
     }
@@ -169,8 +177,14 @@ static int handle_option(scc_argparse_context_t *ctx, scc_argparse_t *parser) {
         *opt->spec.store.bool_store = true;
     }
-    if (ctx->result.value) {
-        opt->spec.raw_value = ctx->result.value;
+    if (!ctx->result.value) {
+        return SCC_ARGPARSE_ERR_NONE;
+    }
+    opt->spec.raw_value = ctx->result.value;
+    if (opt->spec.value_type == SCC_ARGPARSE_VAL_TYPE_LIST) {
+        scc_vec_push(*opt->spec.store.vec_store, ctx->result.value);
+    } else {
         *opt->spec.store.str_store = ctx->result.value;
     }
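
The list handling in the last hunk appends every value of a repeated option to a vector instead of overwriting a single string. A rough self-contained sketch of that idea, assuming a simple `-I <dir>` flag, is shown here. `demo_list_t`, `demo_list_push`, and `demo_collect_includes` are hypothetical and stand in for `scc_argparse_list_t`, `scc_vec_push`, and the argparse machinery, whose real behaviour may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of borrowed strings. */
typedef struct {
    const char **items;
    size_t len;
    size_t cap;
} demo_list_t;

/* Append one value, doubling the capacity when the list is full. */
static void demo_list_push(demo_list_t *list, const char *value) {
    if (list->len == list->cap) {
        size_t cap = list->cap ? list->cap * 2 : 4;
        const char **grown = realloc(list->items, cap * sizeof *grown);
        assert(grown != NULL);
        list->items = grown;
        list->cap = cap;
    }
    list->items[list->len++] = value;
}

/* Collect the value that follows every "-I" flag; other arguments are
 * ignored. The strings are not copied, so argv must outlive the list. */
static void demo_collect_includes(int argc, const char *const *argv,
                                  demo_list_t *out) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-I") == 0)
            demo_list_push(out, argv[++i]);
    }
}
```

Called with `scc -I inc main.c -I libs`, the sketch leaves `inc` and `libs` in the list in command-line order; the caller frees `items` when done.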


@@ -2,6 +2,7 @@
 #define __SCC_AST_DEF_H__
 #include <scc_core.h>
+#include <scc_pos.h>
 /**
  * @brief AST node type enum
@@ -310,7 +311,7 @@ struct scc_ast_expr {
         } compound_literal;
         // Literal
         struct {
-            scc_cvalue_t value;
+            scc_cstring_t lexme;
         } literal;
         // Identifier
         struct {


@@ -392,16 +392,10 @@ static void dump_expr_impl(scc_ast_expr_t *expr, scc_tree_dump_ctx_t *ctx) {
         PRINT_QUOTED_VALUE(ctx, get_op_str(expr->unary.op));
         break;
     case SCC_AST_EXPR_INT_LITERAL:
-        PRINT_VALUE(ctx, " %lld", expr->literal.value.i);
-        break;
     case SCC_AST_EXPR_FLOAT_LITERAL:
-        PRINT_VALUE(ctx, " %f", expr->literal.value.f);
-        break;
     case SCC_AST_EXPR_CHAR_LITERAL:
-        PRINT_VALUE(ctx, " '%c'", (char)expr->literal.value.ch);
-        break;
     case SCC_AST_EXPR_STRING_LITERAL:
-        PRINT_VALUE(ctx, " \"%s\"", expr->literal.value.cstr.data);
+        PRINT_VALUE(ctx, " %s", expr->literal.lexme);
         break;
     case SCC_AST_EXPR_IDENTIFIER:
         if (expr->identifier.name) {


@@ -6,13 +6,11 @@
 #ifndef __SCC_LEXER_H__
 #define __SCC_LEXER_H__
-#include "lexer_token.h"
+#include "scc_lexer_token.h"
 #include <scc_core.h>
 #include <scc_core_ring.h>
 #include <scc_sstream.h>
-typedef SCC_RING(scc_lexer_tok_t) scc_lexer_tok_ring_t;
-typedef SCC_VEC(scc_lexer_tok_t) scc_lexer_tok_vec_t;
 /**
  * @brief Core lexer struct
  *


@@ -4,54 +4,82 @@
 #include <scc_core.h>
 #include <scc_pos.h>
+#include <scc_core_ring.h>
+struct scc_lexer_token;
+typedef struct scc_lexer_token scc_lexer_tok_t;
+typedef SCC_RING(scc_lexer_tok_t) scc_lexer_tok_ring_t;
+typedef SCC_VEC(scc_lexer_tok_t) scc_lexer_tok_vec_t;
 typedef enum scc_cstd {
     SCC_CSTD_C89,
     SCC_CSTD_C99,
     SCC_CEXT_SCC,
 } scc_cstd_t;
+/* clang-format off */
+/// https://cppreference.cn/w/c/preprocessor
+#define SCC_PPKEYWORD_TABLE \
+    X(define , SCC_CSTD_C99, SCC_PP_TOK_DEFINE ) \
+    X(elif , SCC_CSTD_C99, SCC_PP_TOK_ELIF ) \
+    X(elifdef , SCC_CSTD_C99, SCC_PP_TOK_ELIFDEF ) \
+    X(elifndef , SCC_CSTD_C99, SCC_PP_TOK_ELIFNDEF ) \
+    X(else , SCC_CSTD_C99, SCC_PP_TOK_ELSE ) \
+    X(embed , SCC_CSTD_C99, SCC_PP_TOK_EMBED ) \
+    X(endif , SCC_CSTD_C99, SCC_PP_TOK_ENDIF ) \
+    X(error , SCC_CSTD_C99, SCC_PP_TOK_ERROR ) \
+    X(if , SCC_CSTD_C99, SCC_PP_TOK_IF ) \
+    X(ifdef , SCC_CEXT_SCC, SCC_PP_TOK_IFDEF ) \
+    X(ifndef , SCC_CSTD_C99, SCC_PP_TOK_IFNDEF ) \
+    X(include , SCC_CSTD_C99, SCC_PP_TOK_INCLUDE ) \
+    X(line , SCC_CEXT_SCC, SCC_PP_TOK_LINE ) \
+    X(pragma , SCC_CSTD_C99, SCC_PP_TOK_PRAGMA ) \
+    X(undef , SCC_CEXT_SCC, SCC_PP_TOK_UNDEF ) \
+    X(warning , SCC_CSTD_C99, SCC_PP_TOK_WARNING ) \
+    // END
+/* clang-format on */
 /* clang-format off */
 // WARNING: Using Binary Search To Fast Find Keyword
 // You must keep the entries in lexicographic order
 #define SCC_CKEYWORD_TABLE \
-    X(asm , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_ASM , SCC_CEXT_SCC) \
-    X(atomic , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_ATOMIC , SCC_CEXT_SCC) \
-    X(auto , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_AUTO , SCC_CEXT_SCC) \
-    X(bool , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_BOOL , SCC_CEXT_SCC) \
-    X(break , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_BREAK , SCC_CSTD_C89) \
-    X(case , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_CASE , SCC_CSTD_C89) \
-    X(char , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_CHAR , SCC_CSTD_C89) \
-    X(complex , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_COMPLEX , SCC_CEXT_SCC) \
-    X(const , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_CONST , SCC_CSTD_C89) \
-    X(continue , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_CONTINUE , SCC_CSTD_C89) \
-    X(default , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_DEFAULT , SCC_CSTD_C89) \
-    X(do , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_DO , SCC_CSTD_C89) \
-    X(double , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_DOUBLE , SCC_CSTD_C89) \
-    X(else , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_ELSE , SCC_CSTD_C89) \
-    X(enum , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_ENUM , SCC_CSTD_C89) \
-    X(extern , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_EXTERN , SCC_CSTD_C89) \
-    X(float , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_FLOAT , SCC_CSTD_C89) \
-    X(for , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_FOR , SCC_CSTD_C89) \
-    X(goto , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_GOTO , SCC_CSTD_C89) \
-    X(if , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_IF , SCC_CSTD_C89) \
-    X(inline , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_INLINE , SCC_CSTD_C99) \
-    X(int , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_INT , SCC_CSTD_C89) \
-    X(long , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_LONG , SCC_CSTD_C89) \
-    X(register , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_REGISTER , SCC_CSTD_C89) \
-    X(restrict , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_RESTRICT , SCC_CSTD_C99) \
-    X(return , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_RETURN , SCC_CSTD_C89) \
-    X(short , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_SHORT , SCC_CSTD_C89) \
-    X(signed , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_SIGNED , SCC_CSTD_C89) \
-    X(sizeof , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_SIZEOF , SCC_CSTD_C89) \
-    X(static , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_STATIC , SCC_CSTD_C89) \
-    X(struct , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_STRUCT , SCC_CSTD_C89) \
-    X(switch , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_SWITCH , SCC_CSTD_C89) \
-    X(typedef , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_TYPEDEF , SCC_CSTD_C89) \
-    X(union , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_UNION , SCC_CSTD_C89) \
-    X(unsigned , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_UNSIGNED , SCC_CSTD_C89) \
-    X(void , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_VOID , SCC_CSTD_C89) \
-    X(volatile , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_VOLATILE , SCC_CSTD_C89) \
-    X(while , SCC_TOK_SUBTYPE_KEYWORD , SCC_TOK_WHILE , SCC_CSTD_C89) \
+    X(asm , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_ASM , SCC_CEXT_SCC) \
+    X(atomic , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_ATOMIC , SCC_CEXT_SCC) \
+    X(auto , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_AUTO , SCC_CEXT_SCC) \
+    X(bool , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_BOOL , SCC_CEXT_SCC) \
+    X(break , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_BREAK , SCC_CSTD_C89) \
+    X(case , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_CASE , SCC_CSTD_C89) \
+    X(char , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_CHAR , SCC_CSTD_C89) \
+    X(complex , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_COMPLEX , SCC_CEXT_SCC) \
+    X(const , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_CONST , SCC_CSTD_C89) \
+    X(continue , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_CONTINUE , SCC_CSTD_C89) \
+    X(default , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_DEFAULT , SCC_CSTD_C89) \
+    X(do , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_DO , SCC_CSTD_C89) \
+    X(double , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_DOUBLE , SCC_CSTD_C89) \
+    X(else , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_ELSE , SCC_CSTD_C89) \
+    X(enum , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_ENUM , SCC_CSTD_C89) \
+    X(extern , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_EXTERN , SCC_CSTD_C89) \
+    X(float , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_FLOAT , SCC_CSTD_C89) \
+    X(for , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_FOR , SCC_CSTD_C89) \
+    X(goto , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_GOTO , SCC_CSTD_C89) \
+    X(if , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_IF , SCC_CSTD_C89) \
+    X(inline , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_INLINE , SCC_CSTD_C99) \
+    X(int , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_INT , SCC_CSTD_C89) \
+    X(long , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_LONG , SCC_CSTD_C89) \
+    X(register , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_REGISTER , SCC_CSTD_C89) \
+    X(restrict , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_RESTRICT , SCC_CSTD_C99) \
+    X(return , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_RETURN , SCC_CSTD_C89) \
+    X(short , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_SHORT , SCC_CSTD_C89) \
+    X(signed , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_SIGNED , SCC_CSTD_C89) \
+    X(sizeof , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_SIZEOF , SCC_CSTD_C89) \
+    X(static , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_STATIC , SCC_CSTD_C89) \
+    X(struct , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_STRUCT , SCC_CSTD_C89) \
+    X(switch , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_SWITCH , SCC_CSTD_C89) \
+    X(typedef , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_TYPEDEF , SCC_CSTD_C89) \
+    X(union , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_UNION , SCC_CSTD_C89) \
+    X(unsigned , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_UNSIGNED , SCC_CSTD_C89) \
+    X(void , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_VOID , SCC_CSTD_C89) \
+    X(volatile , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_VOLATILE , SCC_CSTD_C89) \
+    X(while , SCC_TOK_SUBTYPE_IDENTIFIER , SCC_TOK_WHILE , SCC_CSTD_C89) \
     // KEYWORD_TABLE
 #define SCC_CTOK_TABLE \
@@ -60,6 +88,7 @@ typedef enum scc_cstd {
     X(blank , SCC_TOK_SUBTYPE_EMPTYSPACE, SCC_TOK_BLANK ) \
     X(endline , SCC_TOK_SUBTYPE_EMPTYSPACE, SCC_TOK_ENDLINE ) \
     X("#" , SCC_TOK_SUBTYPE_OPERATOR, SCC_TOK_SHARP ) \
+    X("##" , SCC_TOK_SUBTYPE_OPERATOR, SCC_TOK_SHARP_SHARP ) \
     X("==" , SCC_TOK_SUBTYPE_OPERATOR, SCC_TOK_EQ ) \
     X("=" , SCC_TOK_SUBTYPE_OPERATOR, SCC_TOK_ASSIGN ) \
     X("++" , SCC_TOK_SUBTYPE_OPERATOR, SCC_TOK_ADD_ADD ) \
@@ -116,22 +145,25 @@ typedef enum scc_cstd {
     // END
 /* clang-format on */
-// Define the TokenType enum
 typedef enum scc_tok_type {
-    // Ordinary tokens
+    /* clang-format off */
+    // must first becase the unknown token must be 0
     #define X(str, subtype, tok) tok,
     SCC_CTOK_TABLE
     #undef X
-    // Keywords (keep the original format)
+    #define X(name, type, tok) tok,
+    SCC_PPKEYWORD_TABLE
+    #undef X
     #define X(name, subtype, tok, std) tok,
     SCC_CKEYWORD_TABLE
     #undef X
+    /* clang-format on*/
 } scc_tok_type_t;
 typedef enum scc_tok_subtype {
     SCC_TOK_SUBTYPE_INVALID, // error placeholder
-    SCC_TOK_SUBTYPE_KEYWORD, // keyword
     SCC_TOK_SUBTYPE_OPERATOR, // operator
     SCC_TOK_SUBTYPE_IDENTIFIER, // identifier
     SCC_TOK_SUBTYPE_LITERAL, // literal
@@ -141,22 +173,47 @@ typedef enum scc_tok_subtype {
     SCC_TOK_SUBTYPE_EOF // end marker
 } scc_tok_subtype_t;
-scc_tok_subtype_t scc_get_tok_subtype(scc_tok_type_t type);
-const char *scc_get_tok_name(scc_tok_type_t type);
 /**
  * @brief
  * @warning lexeme, otherwise a memory leak will occur
  */
-typedef struct scc_lexer_token {
+struct scc_lexer_token {
     scc_tok_type_t type;
     scc_cstring_t lexeme;
     scc_pos_t loc;
-} scc_lexer_tok_t;
+};
+scc_tok_subtype_t scc_get_tok_subtype(scc_tok_type_t type);
+const char *scc_get_tok_name(scc_tok_type_t type);
+static inline void scc_lexer_tok_drop(scc_lexer_tok_t *tok) {
+    tok->type = SCC_TOK_UNKNOWN;
+    tok->loc.col = 0;
+    tok->loc.line = 0;
+    tok->loc.name = null;
+    tok->loc.offset = 0;
+    scc_cstring_free(&tok->lexeme);
+}
 static inline cbool scc_lexer_tok_match(const scc_lexer_tok_t *tok,
                                         scc_tok_type_t type) {
     return tok->type == type;
 }
+// Deep-copy a token
+static inline scc_lexer_tok_t scc_lexer_tok_copy(const scc_lexer_tok_t *src) {
+    scc_lexer_tok_t dst = *src;
+    dst.lexeme = scc_cstring_copy(&src->lexeme);
+    return dst;
+}
+// Move a token; the source token no longer owns the lexeme
+static inline void scc_lexer_tok_move(scc_lexer_tok_t *dst,
+                                      scc_lexer_tok_t *src) {
+    *dst = *src;
+    src->lexeme.data = null;
+    src->lexeme.size = 0;
+    src->lexeme.cap = 0;
+}
 #endif /* __SCC_LEXER_TOKEN_H__ */
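
The difference between the copy and move helpers in this header is one of ownership. The sketch below restates it with simplified stand-in types so that it compiles on its own; `demo_str_t`, `demo_tok_t`, and the `demo_*` functions are hypothetical, and the real `scc_cstring_t` is assumed (not confirmed) to behave similarly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for scc_cstring_t and scc_lexer_tok_t. */
typedef struct {
    char *data;
    size_t size;
} demo_str_t;

typedef struct {
    int type;
    demo_str_t lexeme;
} demo_tok_t;

/* Allocate an owned copy of s; a NULL input gives an empty string. */
static demo_str_t demo_str_from(const char *s) {
    demo_str_t r = {NULL, 0};
    if (s == NULL)
        return r;
    r.size = strlen(s);
    r.data = malloc(r.size + 1);
    assert(r.data != NULL);
    memcpy(r.data, s, r.size + 1);
    return r;
}

/* Deep copy: the result owns a fresh buffer, so both tokens need a drop. */
static demo_tok_t demo_tok_copy(const demo_tok_t *src) {
    demo_tok_t dst = *src;
    dst.lexeme = demo_str_from(src->lexeme.data);
    return dst;
}

/* Move: dst takes over the buffer; src is left empty and safe to drop. */
static void demo_tok_move(demo_tok_t *dst, demo_tok_t *src) {
    *dst = *src;
    src->lexeme.data = NULL;
    src->lexeme.size = 0;
}

/* Release the lexeme; dropping an already-empty token is harmless. */
static void demo_tok_drop(demo_tok_t *tok) {
    free(tok->lexeme.data); /* free(NULL) is a no-op */
    tok->lexeme.data = NULL;
    tok->lexeme.size = 0;
}
```

Resetting the source after a move is what makes a later drop of that source safe, which is presumably why the move helper above clears `data`, `size`, and `cap`.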


@@ -0,0 +1,42 @@
+#ifndef __SCC_LEXER_UTILS_H__
+#define __SCC_LEXER_UTILS_H__
+#include "scc_lexer.h"
+static inline cbool scc_lexer_peek_non_blank(scc_lexer_tok_ring_t *stream,
+                                             scc_lexer_tok_t *out) {
+    cbool ok;
+    while (1) {
+        scc_ring_peek(*stream, *out, ok);
+        if (!ok || out->type != SCC_TOK_BLANK)
+            break;
+        scc_ring_next_consume(*stream, *out, ok);
+        scc_lexer_tok_drop(out);
+    }
+    return ok;
+}
+static inline cbool scc_lexer_next_non_blank(scc_lexer_tok_ring_t *stream,
+                                             scc_lexer_tok_t *out) {
+    cbool ok;
+    if (!scc_lexer_peek_non_blank(stream, out))
+        return false;
+    scc_ring_next_consume(*stream, *out, ok);
+    return true;
+}
+static inline void scc_lexer_skip_until_newline(scc_lexer_tok_ring_t *stream) {
+    scc_lexer_tok_t tok;
+    cbool ok;
+    while (1) {
+        scc_ring_next_consume(*stream, tok, ok);
+        if (!ok)
+            break;
+        scc_tok_type_t type = tok.type;
+        scc_lexer_tok_drop(&tok);
+        if (type == SCC_TOK_ENDLINE)
+            break;
+    }
+}
+#endif /* __SCC_LEXER_UTILS_H__ */
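
The three helpers in this new header follow a common peek, consume, and skip pattern. A minimal sketch of the same control flow over a plain array is given here, assuming a hypothetical `demo_stream_t` in place of the ring buffer; none of the `demo_*` names exist in scc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical array-backed token stream; scc uses a ring buffer. */
typedef enum { DEMO_BLANK, DEMO_ENDLINE, DEMO_IDENT } demo_kind_t;

typedef struct {
    const demo_kind_t *toks;
    size_t len;
    size_t pos;
} demo_stream_t;

/* Drop leading blanks, then report the next token without consuming it. */
static bool demo_peek_non_blank(demo_stream_t *s, demo_kind_t *out) {
    assert(s != NULL && out != NULL);
    while (s->pos < s->len && s->toks[s->pos] == DEMO_BLANK)
        s->pos++;
    if (s->pos >= s->len)
        return false;
    *out = s->toks[s->pos];
    return true;
}

/* Same as peek, but also consumes the token that was found. */
static bool demo_next_non_blank(demo_stream_t *s, demo_kind_t *out) {
    if (!demo_peek_non_blank(s, out))
        return false;
    s->pos++;
    return true;
}

/* Consume tokens up to and including the next end-of-line token. */
static void demo_skip_until_newline(demo_stream_t *s) {
    while (s->pos < s->len) {
        if (s->toks[s->pos++] == DEMO_ENDLINE)
            break;
    }
}
```

Note that peeking is not side-effect free in this design: blanks in front of the reported token are consumed, which matches the loop in `scc_lexer_peek_non_blank` above.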


@@ -5,7 +5,7 @@
 static const struct {
     const char *name;
     scc_cstd_t std_type;
-    scc_tok_type_t tok;
+    scc_tok_type_t tok_type;
 } keywords[] = {
 #define X(name, subtype, tok, std_type, ...) {#name, std_type, tok},
     SCC_CKEYWORD_TABLE
@@ -88,7 +88,7 @@ static inline cbool next_char(scc_lexer_t *lexer, scc_cstring_t *lexeme,
 #define set_err_token(token) ((token)->type = SCC_TOK_UNKNOWN)
 void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
-    scc_sstream_char_t cur;
+    scc_sstream_char_t cur = {0};
     scc_cstring_t lex = scc_cstring_create(); // temporary lexeme
     // Try to peek at the first character
@@ -168,7 +168,7 @@ void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
         // Check whether it is a keyword
         int idx = keyword_cmp(scc_cstring_as_cstr(&lex), scc_cstring_len(&lex));
         if (idx != -1) {
-            token->type = keywords[idx].tok;
+            token->type = keywords[idx].tok_type;
         }
     } else if (is_digit(ch)) {
         // Numeric literal (integer/floating-point)
@@ -439,6 +439,10 @@ void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
             token->type = SCC_TOK_COND;
             break;
         case '#':
+            if (next.character == '#') {
+                token->type = SCC_TOK_SHARP_SHARP;
+                next_char(lexer, &lex, &cur);
+            } else
                 token->type = SCC_TOK_SHARP;
             break;
         default:
@@ -461,33 +465,43 @@ void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
 // scc_lexer_get_token maybe got invalid (with parser)
 void scc_lexer_get_valid_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
     scc_tok_subtype_t subtype;
-    do {
+    while (1) {
         scc_lexer_get_token(lexer, token);
         subtype = scc_get_tok_subtype(token->type);
         AssertFmt(subtype != SCC_TOK_SUBTYPE_INVALID,
                   "Invalid token: `%s` at %s:%d:%d",
                   scc_get_tok_name(token->type), token->loc.name,
                   token->loc.line, token->loc.col);
-    } while (subtype == SCC_TOK_SUBTYPE_EMPTYSPACE ||
-             subtype == SCC_TOK_SUBTYPE_COMMENT);
+        if (subtype == SCC_TOK_SUBTYPE_EMPTYSPACE ||
+            subtype == SCC_TOK_SUBTYPE_COMMENT) {
+            scc_lexer_tok_drop(token);
+        }
+        break;
+    };
 }
-static int fill_token(scc_lexer_tok_t *out, void *userdata) {
+static cbool fill_token(scc_lexer_tok_t *out, void *userdata) {
     scc_lexer_t *lexer = userdata;
     scc_lexer_get_token(lexer, out);
-    return 0;
+    if (out->type == SCC_TOK_EOF) {
+        return false;
+    }
+    return true;
 }
-static int fill_valid_token(scc_lexer_tok_t *out, void *userdata) {
+static cbool fill_valid_token(scc_lexer_tok_t *out, void *userdata) {
     scc_lexer_t *lexer = userdata;
     scc_lexer_get_valid_token(lexer, out);
-    return 0;
+    if (out->type == SCC_TOK_EOF) {
+        return false;
+    }
+    return true;
 }
 scc_lexer_tok_ring_t *scc_lexer_to_ring(scc_lexer_t *lexer, int ring_size,
-                                        cbool need_comment) {
+                                        cbool fill_all) {
     scc_ring_init(lexer->ring, ring_size,
-                  need_comment ? fill_token : fill_valid_token, lexer);
+                  fill_all ? fill_token : fill_valid_token, lexer);
     lexer->ring_ref_count++;
     return &lexer->ring;
 }
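
The change from `int` to `cbool` in the fill callbacks turns the return value into an explicit "was an item produced" signal. A self-contained sketch of that producer and consumer contract, with integers instead of tokens, might look like this; `demo_fill_fn`, `demo_source_t`, `demo_fill`, and `demo_drain` are hypothetical names, and the real `SCC_RING` macros are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback shape: true when *out was filled,
 * false once the input is exhausted. */
typedef bool (*demo_fill_fn)(int *out, void *userdata);

typedef struct {
    const int *items;
    size_t len;
    size_t pos;
} demo_source_t;

/* Producer: hands out one item per call until the source runs dry. */
static bool demo_fill(int *out, void *userdata) {
    demo_source_t *src = userdata;
    if (src->pos >= src->len)
        return false;
    *out = src->items[src->pos++];
    return true;
}

/* Consumer: pull items until the producer reports exhaustion.
 * Returns the number of items read and stores their sum in *sum. */
static size_t demo_drain(demo_fill_fn fill, void *userdata, long *sum) {
    assert(fill != NULL && sum != NULL);
    size_t n = 0;
    int item;
    *sum = 0;
    while (fill(&item, userdata)) {
        *sum += item;
        n++;
    }
    return n;
}
```

With a boolean result the consumer loop needs no sentinel value, which is the same shape as the `if (!ok) break;` check used with `scc_ring_next_consume` elsewhere in this diff.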


@@ -40,16 +40,24 @@ int main(int argc, char *argv[]) {
     scc_lexer_t lexer;
     scc_sstream_t stream;
     scc_sstream_init(&stream, file_name, 16);
-    scc_sstream_ring_t *ref = scc_sstream_ref_ring(&stream);
+    scc_sstream_ring_t *ref = scc_sstream_to_ring(&stream);
     scc_lexer_init(&lexer, ref);
     scc_lexer_tok_t token;
+    scc_lexer_tok_ring_t *tok_ring = scc_lexer_to_ring(&lexer, 16, false);
+    int ok;
     while (1) {
-        scc_lexer_get_valid_token(&lexer, &token);
-        if (token.type == SCC_TOK_EOF) {
+        // scc_lexer_get_valid_token(&lexer, &token);
+        // if (token.type == SCC_TOK_EOF) {
+        //     break;
+        // }
+        scc_ring_next_consume(*tok_ring, token, ok);
+        if (!ok) {
             break;
         }
-        LOG_DEBUG("get token [%-8s] `%s` at %s:%d:%d",
+        LOG_INFO("get token [%-8s] `%s` at %s:%d:%d",
                  scc_get_tok_name(token.type),
                  scc_cstring_as_cstr(&token.lexeme), token.loc.name,
                  token.loc.line, token.loc.col);


@@ -1,4 +1,4 @@
-#include <lexer_token.h>
+#include <scc_lexer_token.h>
 // Generate the string mapping (choose #str or #name as needed)
 static const char *token_strings[] = {


@@ -13,7 +13,7 @@ static void free_token(scc_lexer_tok_t *tok) { scc_cstring_free(&tok->lexeme); }
scc_lexer_tok_t token; \ scc_lexer_tok_t token; \
scc_sstream_t stream; \ scc_sstream_t stream; \
scc_sstream_init_by_buffer(&stream, input, strlen(input), 0, 16); \ scc_sstream_init_by_buffer(&stream, input, strlen(input), 0, 16); \
scc_sstream_ring_t *ref = scc_sstream_ref_ring(&stream); \ scc_sstream_ring_t *ref = scc_sstream_to_ring(&stream); \
scc_lexer_init(&lexer, ref); \ scc_lexer_init(&lexer, ref); \
scc_lexer_get_token(&lexer, &token); \ scc_lexer_get_token(&lexer, &token); \
\ \
@@ -34,7 +34,7 @@ static void free_token(scc_lexer_tok_t *tok) { scc_cstring_free(&tok->lexeme); }
scc_lexer_tok_t token; \ scc_lexer_tok_t token; \
scc_sstream_t stream; \ scc_sstream_t stream; \
scc_sstream_init_by_buffer(&stream, input, strlen(input), 0, 16); \ scc_sstream_init_by_buffer(&stream, input, strlen(input), 0, 16); \
scc_sstream_ring_t *ref = scc_sstream_ref_ring(&stream); \ scc_sstream_ring_t *ref = scc_sstream_to_ring(&stream); \
scc_lexer_init(&lexer, ref); \ scc_lexer_init(&lexer, ref); \
\ \
scc_tok_type_t expected[] = {__VA_ARGS__}; \ scc_tok_type_t expected[] = {__VA_ARGS__}; \
@@ -301,8 +301,7 @@ void test_identifiers() {
void test_preprocessor() { void test_preprocessor() {
TEST_CASE("Preprocessor directives - just the # token"); TEST_CASE("Preprocessor directives - just the # token");
TEST_TOKEN("#", SCC_TOK_SHARP); TEST_TOKEN("#", SCC_TOK_SHARP);
TEST_TOKEN("##", SCC_TOK_SHARP); // 第一个 # 是 token第二个 # 将是下一个 TEST_TOKEN("##", SCC_TOK_SHARP_SHARP);
// token在序列测试中验证
// 多 token 序列测试 #include 等 // 多 token 序列测试 #include 等
TEST_SEQUENCE("#include <stdio.h>", SCC_TOK_SHARP, SCC_TOK_IDENT, TEST_SEQUENCE("#include <stdio.h>", SCC_TOK_SHARP, SCC_TOK_IDENT,
@@ -311,6 +310,18 @@ void test_preprocessor() {
TEST_SEQUENCE("#define FOO 123", SCC_TOK_SHARP, SCC_TOK_IDENT, TEST_SEQUENCE("#define FOO 123", SCC_TOK_SHARP, SCC_TOK_IDENT,
SCC_TOK_BLANK, SCC_TOK_IDENT, SCC_TOK_BLANK, SCC_TOK_BLANK, SCC_TOK_IDENT, SCC_TOK_BLANK,
SCC_TOK_INT_LITERAL); SCC_TOK_INT_LITERAL);
TEST_SEQUENCE("#define FOO(x) x + 1", SCC_TOK_SHARP, SCC_TOK_IDENT,
SCC_TOK_BLANK, SCC_TOK_IDENT, SCC_TOK_L_PAREN, SCC_TOK_IDENT,
SCC_TOK_R_PAREN, SCC_TOK_BLANK, SCC_TOK_IDENT, SCC_TOK_BLANK,
SCC_TOK_ADD, SCC_TOK_BLANK, SCC_TOK_INT_LITERAL);
TEST_SEQUENCE("#undef FOO", SCC_TOK_SHARP, SCC_TOK_IDENT, SCC_TOK_BLANK,
SCC_TOK_IDENT);
TEST_SEQUENCE("#error \"This is an error\"", SCC_TOK_SHARP, SCC_TOK_IDENT,
SCC_TOK_BLANK, SCC_TOK_STRING_LITERAL);
TEST_SEQUENCE("#warning \"This is a warning\"\n", SCC_TOK_SHARP,
SCC_TOK_IDENT, SCC_TOK_BLANK, SCC_TOK_STRING_LITERAL,
SCC_TOK_ENDLINE);
} }
void test_edge_cases() { void test_edge_cases() {
@@ -348,7 +359,7 @@ void test_sequences() {
TEST_SEQUENCE("<<=", SCC_TOK_ASSIGN_L_SH); TEST_SEQUENCE("<<=", SCC_TOK_ASSIGN_L_SH);
TEST_SEQUENCE("...", SCC_TOK_ELLIPSIS); TEST_SEQUENCE("...", SCC_TOK_ELLIPSIS);
TEST_SEQUENCE("->", SCC_TOK_DEREF); TEST_SEQUENCE("->", SCC_TOK_DEREF);
TEST_SEQUENCE("##", SCC_TOK_SHARP, SCC_TOK_SHARP); // two preprocessing tokens TEST_SEQUENCE("##", SCC_TOK_SHARP_SHARP); // a single ## preprocessing token
TEST_CASE("Comments and whitespace interleaved"); TEST_CASE("Comments and whitespace interleaved");
TEST_SEQUENCE("/* comment */ a // line comment\n b", SCC_TOK_BLOCK_COMMENT, TEST_SEQUENCE("/* comment */ a // line comment\n b", SCC_TOK_BLOCK_COMMENT,
@@ -371,18 +382,18 @@ void test_error_recovery() {
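// Unterminated character literal: the lexer may keep reading until a newline or EOF // Unterminated character literal: the lexer may keep reading until a newline or EOF
// Assume here that it yields an SCC_TOK_CHAR_LITERAL that runs to the end of input // Assume here that it yields an SCC_TOK_CHAR_LITERAL that runs to the end of input
// But an unterminated literal is an error in standard C, so we may return UNKNOWN // But an unterminated literal is an error in standard C, so we may return UNKNOWN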
TEST_CASE("Unterminated character literal"); // TEST_CASE("Unterminated character literal");
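TEST_TOKEN("'a", SCC_TOK_UNKNOWN); // implementation-dependent; may be CHAR_LITERAL // TEST_TOKEN("'a", SCC_TOK_UNKNOWN); // implementation-dependent; may be CHAR_LITERAL
// A more reliable test: check which token comes next in the sequence // // A more reliable test: check which token comes next in the sequence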
TEST_SEQUENCE("'a b", SCC_TOK_UNKNOWN, // TEST_SEQUENCE("'a b", SCC_TOK_UNKNOWN,
SCC_TOK_IDENT); // assumes the first token is an error // SCC_TOK_IDENT); // assumes the first token is an error
TEST_CASE("Unterminated string literal"); // TEST_CASE("Unterminated string literal");
TEST_TOKEN("\"hello", SCC_TOK_UNKNOWN); // likewise // TEST_TOKEN("\"hello", SCC_TOK_UNKNOWN); // likewise
TEST_CASE("Unterminated block comment"); // TEST_CASE("Unterminated block comment");
TEST_SEQUENCE("/* comment", // TEST_SEQUENCE("/* comment",
SCC_TOK_BLOCK_COMMENT); // runs to EOF; may still be reported as a comment // SCC_TOK_BLOCK_COMMENT); // runs to EOF; may still be reported as a comment
} }
// ============================ Main test list ============================ // ============================ Main test list ============================

libs/pproc/cbuild.toml Normal file

@@ -0,0 +1,7 @@
[package]
name = "scc_pprocesser"
dependencies = [
{ name = "scc_utils", path = "../../runtime/scc_utils" },
{ name = "lexer", path = "../lexer" },
]


@@ -0,0 +1,33 @@
#ifndef __SCC_PPROC_EXPAND_H__
#define __SCC_PPROC_EXPAND_H__
#include <pproc_macro.h>
#include <scc_core.h>
#include <scc_core_ring.h>
#include <scc_lexer.h>
typedef struct {
scc_pproc_macro_table_t *macro_table;
scc_pproc_macro_table_t *expanded_set;
scc_lexer_tok_ring_t *input;
scc_lexer_tok_vec_t output;
int need_rescan;
} scc_pproc_expand_t;
static inline scc_lexer_tok_ring_t
scc_lexer_array_to_ring(scc_lexer_tok_vec_t *array) {
scc_lexer_tok_ring_t ret;
scc_ring_by_buffer(ret, array->data, array->size);
return ret;
}
void scc_pproc_expand_macro(scc_pproc_expand_t *expand_ctx);
void scc_pproc_expand_by_src(scc_pproc_macro_table_t *macro_table,
scc_lexer_tok_ring_t *input,
scc_lexer_tok_ring_t *output,
const scc_pproc_macro_t *macro);
void scc_pproc_expand_by_vec(scc_pproc_macro_table_t *macro_table,
scc_lexer_tok_vec_t *input,
scc_lexer_tok_ring_t *output);
#endif /* __SCC_PPROC_EXPAND_H__ */


@@ -0,0 +1,97 @@
#ifndef __SCC_PP_MACRO_H__
#define __SCC_PP_MACRO_H__
#include <scc_core.h>
#include <scc_lexer.h>
#include <scc_utils.h>
// Macro definition kinds
typedef enum {
SCC_PP_MACRO_NONE, // not a macro
SCC_PP_MACRO_OBJECT, // object-like macro
SCC_PP_MACRO_FUNCTION, // function-like macro
} scc_pproc_macro_type_t;
typedef SCC_VEC(scc_lexer_tok_vec_t) scc_pproc_macro_extened_params_t;
// Macro definition
typedef struct scc_macro {
scc_cstring_t name; // macro name
scc_pproc_macro_type_t type; // macro kind
scc_lexer_tok_vec_t replaces; // replacement list
scc_lexer_tok_vec_t params; // parameter list (function-like macros only)
} scc_pproc_macro_t;
typedef struct scc_macro_table {
scc_hashtable_t table; // macro definition table
} scc_pproc_macro_table_t;
/**
* @brief Create a macro object
* @param name Macro name
* @param type Macro kind
* @return Pointer to the new macro object, or NULL on failure
*/
scc_pproc_macro_t *scc_pproc_macro_new(const scc_cstring_t *name,
scc_pproc_macro_type_t type);
/**
* @brief Destroy a macro object
* @param macro Macro object to destroy
*/
void scc_pproc_macro_drop(scc_pproc_macro_t *macro);
/**
* @brief Add an object-like macro
* @param pp Macro table to modify
* @param name Macro name
* @param replacement Replacement token list
* @return true on success, false on failure
*/
cbool scc_pproc_add_object_macro(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name,
const scc_lexer_tok_vec_t *replacement);
/**
* @brief Add a function-like macro
* @param pp Macro table to modify
* @param name Macro name
* @param params Parameter list
* @param replacement Replacement token list
* @return true on success, false on failure
*/
cbool scc_pproc_add_function_macro(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name,
const scc_lexer_tok_vec_t *params,
const scc_lexer_tok_vec_t *replacement);
/**
* @brief Insert a macro definition into the table
*
* @param pp Macro table to modify
* @param macro Macro definition to insert
* @return scc_pproc_macro_t*
*/
scc_pproc_macro_t *scc_pproc_macro_table_set(scc_pproc_macro_table_t *pp,
scc_pproc_macro_t *macro);
/**
* @brief Look up a macro definition
* @param pp Macro table to search
* @param name Macro name
* @return Pointer to the macro if found, NULL otherwise
*/
scc_pproc_macro_t *scc_pproc_macro_table_get(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name);
/**
* @brief Remove a macro from the table
* @param pp Macro table to modify
* @param name Macro name
* @return true if the macro was removed, false if it was not found
*/
cbool scc_pproc_macro_table_remove(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name);
void scc_pproc_marco_table_init(scc_pproc_macro_table_t *macros);
void scc_pproc_macro_table_drop(scc_pproc_macro_table_t *macros);
#endif /* __SCC_PP_MACRO_H__ */


@@ -0,0 +1,84 @@
/**
* @file pprocessor.h
* @brief Core data structures and interfaces of the C preprocessor
*/
#ifndef __SCC_PPROC_H__
#define __SCC_PPROC_H__
#include "pproc_macro.h"
#include <scc_core.h>
#include <scc_core_ring.h>
#include <scc_lexer.h>
// Preprocessor state structures
// Conditional-compilation state stack
typedef struct {
cbool found_true;
cbool seen_else;
cbool active;
} scc_pproc_if_t;
typedef SCC_VEC(scc_pproc_if_t) scc_pproc_if_stack_t;
typedef struct {
scc_sstream_t sstream;
scc_lexer_t lexer;
scc_lexer_tok_ring_t *ring;
} scc_pproc_file_t;
typedef SCC_VEC(scc_pproc_file_t *) scc_pproc_file_stack_t;
typedef SCC_VEC(scc_lexer_tok_ring_t *) scc_pproc_ring_vec_t;
typedef SCC_VEC(scc_cstring_t) scc_pproc_cstr_vec_t;
typedef struct scc_pproc {
scc_lexer_tok_ring_t *org_ring;
scc_lexer_tok_ring_t *cur_ring;
scc_lexer_tok_ring_t expanded_ring;
scc_strpool_t strpool;
int at_line_start;
int enable;
scc_pproc_cstr_vec_t include_paths;
scc_pproc_macro_table_t macro_table;
scc_pproc_if_stack_t if_stack;
scc_pproc_file_stack_t file_stack;
scc_lexer_tok_ring_t ring;
int ring_ref_count;
struct {
int max_include_depth;
} config;
} scc_pproc_t;
void scc_pproc_init(scc_pproc_t *pp, scc_lexer_tok_ring_t *input);
scc_lexer_tok_ring_t *scc_pproc_to_ring(scc_pproc_t *pp, int ring_size);
void scc_pproc_drop(scc_pproc_t *pp);
static inline void scc_pproc_add_include_path(scc_pproc_t *pp,
const scc_cstring_t *path) {
scc_vec_push(pp->include_paths, scc_cstring_copy(path));
}
static inline void scc_pproc_add_include_path_cstr(scc_pproc_t *pp,
const char *path) {
scc_vec_push(pp->include_paths, scc_cstring_from_cstr(path));
}
void scc_pproc_handle_directive(scc_pproc_t *pp);
cbool scc_pproc_parse_if_defined(scc_pproc_t *pp, scc_tok_type_t type,
const scc_lexer_tok_t *tok);
cbool scc_pproc_parse_if_condition(scc_pproc_t *pp, scc_tok_type_t type,
scc_lexer_tok_ring_t *tok_ring);
void scc_pproc_parse_include(scc_pproc_t *pp, scc_lexer_tok_t *include_tok,
scc_lexer_tok_ring_t *tok_ring);
void scc_pproc_parse_macro_arguments(scc_lexer_tok_ring_t *ring,
scc_lexer_tok_vec_t *args, int need_full);
void scc_pproc_parse_function_macro(scc_pproc_t *pp,
const scc_lexer_tok_t *ident);
void scc_pproc_parse_object_macro(scc_pproc_t *pp,
const scc_lexer_tok_t *ident);
#endif /* __SCC_PPROC_H__ */


@@ -0,0 +1,398 @@
#include <pproc_expand.h>
#include <scc_core_ring.h>
#include <scc_lexer_utils.h>
#include <scc_pproc.h>
static const struct {
const char *name;
scc_tok_type_t tok_type;
} keywords[] = {
#define X(name, type, tok) {#name, tok},
SCC_PPKEYWORD_TABLE
#undef X
};
// Look up a keyword with binary search (the keyword table must be sorted by name)
static inline int keyword_cmp(const char *name, int len) {
int low = 0;
int high = sizeof(keywords) / sizeof(keywords[0]) - 1;
while (low <= high) {
int mid = (low + high) / 2;
const char *key = keywords[mid].name;
int cmp = 0;
// Custom comparison of the length-delimited name against the key
for (int i = 0; i < len; i++) {
if (name[i] != key[i]) {
cmp = (unsigned char)name[i] - (unsigned char)key[i];
break;
}
if (name[i] == '\0')
break; // stop early at the terminator
}
if (cmp == 0) {
// An exact match also requires equal lengths
if (key[len] == '\0')
return mid;
cmp = -1; // the current keyword is longer than the input
}
if (cmp < 0) {
high = mid - 1;
} else {
low = mid + 1;
}
}
return -1; // Not a keyword.
}
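Review note: keyword_cmp above performs a binary search over a name-sorted table with a length-delimited input. A minimal standalone sketch of the same idea follows, assuming a hypothetical table `demo_keywords` (the real table comes from SCC_PPKEYWORD_TABLE, which is not shown here) and hypothetical helpers `demo_cmp` and `demo_keyword_index`. It uses memcmp plus a length tie-break rather than a hand-written character loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directive table; it must stay sorted by name. */
static const char *const demo_keywords[] = {
    "define", "elif",   "else",    "endif",   "error",
    "if",     "ifdef",  "ifndef",  "include", "undef",
};

/* Compare a length-delimited name against a NUL-terminated key. */
int demo_cmp(const char *name, size_t len, const char *key) {
    size_t key_len = strlen(key);
    size_t n = len < key_len ? len : key_len;
    int cmp = memcmp(name, key, n);
    if (cmp != 0)
        return cmp;
    /* The common prefix is equal: the shorter string sorts first. */
    return (len > key_len) - (len < key_len);
}

/* Return the index of the keyword, or -1 if name is not a keyword. */
int demo_keyword_index(const char *name, size_t len) {
    assert(name != NULL);
    int low = 0;
    int high = (int)(sizeof(demo_keywords) / sizeof(demo_keywords[0])) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        int cmp = demo_cmp(name, len, demo_keywords[mid]);
        if (cmp == 0)
            return mid;
        if (cmp < 0)
            high = mid - 1;
        else
            low = mid + 1;
    }
    return -1;
}
```

Because the name is length-delimited, a prefix such as `ifd` is not mistaken for `ifdef`, and the lookup never reads past `len` characters of the input.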
void scc_pproc_parse_macro_arguments(scc_lexer_tok_ring_t *ring,
scc_lexer_tok_vec_t *args, int need_full) {
Assert(ring != null && args != null);
scc_lexer_tok_t tok = {0};
int depth = 0;
int ok;
do {
scc_ring_next_consume(*ring, tok, ok);
if (!ok) {
return;
}
// scc_lexer_next_non_blank(ring, &tok);
if (tok.type == SCC_TOK_R_PAREN) {
depth--;
}
ok = depth > 0 || need_full;
if (ok) {
scc_vec_push(*args, tok);
}
if (tok.type == SCC_TOK_L_PAREN) {
depth++;
}
if (!ok) {
scc_lexer_tok_drop(&tok);
}
} while (depth);
}
static inline void fill_replacements(scc_pproc_t *pp,
scc_pproc_macro_t *macro) {
int ok;
scc_lexer_tok_t tok;
ok = scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (!ok || tok.type == SCC_TOK_EOF || tok.type == SCC_TOK_ENDLINE) {
return;
} else {
scc_vec_push(macro->replaces, tok);
}
while (1) {
scc_ring_next_consume(*pp->cur_ring, tok, ok);
if (!ok)
break;
if (tok.type == SCC_TOK_EOF || tok.type == SCC_TOK_ENDLINE) {
scc_lexer_tok_drop(&tok);
break;
}
scc_vec_push(macro->replaces, tok);
}
}
void scc_pproc_parse_function_macro(scc_pproc_t *pp,
const scc_lexer_tok_t *ident) {
scc_lexer_tok_vec_t args;
scc_vec_init(args);
scc_pproc_parse_macro_arguments(pp->cur_ring, &args, false);
scc_pproc_macro_t *macro =
scc_pproc_macro_new(&ident->lexeme, SCC_PP_MACRO_FUNCTION);
/*
check and set params
1. identifier-list(opt)
2. ...
3. identifier-list , ...
*/
int idx = 0;
scc_vec_foreach(args, i) {
scc_lexer_tok_t *arg = &scc_vec_at(args, i);
if (arg->type == SCC_TOK_COMMA) {
scc_lexer_tok_drop(arg);
if (idx++ % 2 != 1) {
LOG_FATAL("expected a parameter name before ','");
}
} else if (arg->type == SCC_TOK_IDENT) {
if (idx++ % 2 != 0) {
LOG_FATAL("expected ',' between macro parameters");
}
scc_vec_push(macro->params, *arg);
} else if (arg->type == SCC_TOK_ELLIPSIS) {
if (idx++ % 2 != 0) {
LOG_FATAL("expected ',' before '...'");
}
scc_cstring_t va_args = scc_cstring_from_cstr("__VA_ARGS__");
scc_cstring_free(&arg->lexeme);
arg->lexeme = va_args;
scc_vec_push(macro->params, *arg);
} else if (scc_get_tok_subtype(arg->type) ==
SCC_TOK_SUBTYPE_EMPTYSPACE ||
scc_get_tok_subtype(arg->type) == SCC_TOK_SUBTYPE_COMMENT) {
scc_lexer_tok_drop(arg);
} else {
LOG_FATAL("unexpected token in macro parameter list");
}
}
fill_replacements(pp, macro);
scc_pproc_macro_table_set(&pp->macro_table, macro);
}
void scc_pproc_parse_object_macro(scc_pproc_t *pp,
const scc_lexer_tok_t *ident) {
scc_pproc_macro_t *macro =
scc_pproc_macro_new(&ident->lexeme, SCC_PP_MACRO_OBJECT);
fill_replacements(pp, macro);
scc_pproc_macro_table_set(&pp->macro_table, macro);
}
static void scc_pproc_parse_line_and_expand(scc_pproc_t *pp,
scc_lexer_tok_ring_t *out_ring) {
int ok;
scc_lexer_tok_t tok;
scc_lexer_tok_ring_t *stream = pp->cur_ring;
scc_lexer_tok_vec_t org_toks;
scc_vec_init(org_toks);
while (1) {
scc_ring_peek(*stream, tok, ok);
if (ok == false)
break;
scc_ring_next_consume(*stream, tok, ok);
scc_vec_push(org_toks, tok);
if (tok.type == SCC_TOK_ENDLINE)
break;
}
scc_pproc_expand_by_vec(&pp->macro_table, &org_toks, out_ring);
}
/*
```txt
6.10 Preprocessing directives
preprocessing-file:
group(opt)
group:
group-part
group group-part
group-part:
if-section
control-line
text-line
# non-directive
if-section:
if-group elif-groups(opt) else-group(opt) endif-line
if-group:
# if constant-expression new-line group(opt)
# ifdef identifier new-line group(opt)
# ifndef identifier new-line group(opt)
elif-groups:
elif-group
elif-groups elif-group
elif-group:
#elif constant-expression new-line group(opt)
else-group:
# else new-line group(opt)
endif-line:
# endif new-line
control-line:
# include pp-tokens new-line
# define identifier replacement-list new-line
# define identifier lparen identifier-list(opt) )
replacement-list new-line
# define identifier lparen ... ) replacement-list new-line
# define identifier lparen identifier-list ,... )
replacement-list new-line
# undef identifier new-line
# line pp-tokens new-line
# error pp-tokens(opt) new-line
# pragma pp-tokens(opt) new-line
# new-line
text-line:
pp-tokens(opt) new-line
non-directive:
pp-tokens new-line
lparen:
`a ( character not immediately preceded by white-space`
replacement-list:
pp-tokens(opt)
pp-tokens:
preprocessing-token
pp-tokens preprocessing-token
new-line:
the new-line character
```
*/
void scc_pproc_handle_directive(scc_pproc_t *pp) {
scc_lexer_tok_t tok = {0};
int ok = 0;
scc_ring_next(*pp->cur_ring, tok, ok);
Assert(ok == true && tok.type == SCC_TOK_SHARP);
scc_lexer_tok_drop(&tok);
if (!scc_lexer_next_non_blank(pp->cur_ring, &tok) ||
scc_get_tok_subtype(tok.type) != SCC_TOK_SUBTYPE_IDENTIFIER) {
scc_lexer_tok_drop(&tok);
LOG_ERROR("Invalid preprocessor directive");
goto ERROR;
}
int ret = keyword_cmp(scc_cstring_as_cstr(&tok.lexeme),
scc_cstring_len(&tok.lexeme));
if (ret == -1) {
// Pass the lexeme as a C string, and log before the token is dropped.
LOG_ERROR("Expected preprocessor keyword, got %s",
scc_cstring_as_cstr(&tok.lexeme));
scc_lexer_tok_drop(&tok);
goto ERROR;
}
scc_tok_type_t type = keywords[ret].tok_type;
if (scc_vec_size(pp->if_stack) != 0) {
scc_pproc_if_t *top =
&scc_vec_at(pp->if_stack, scc_vec_size(pp->if_stack) - 1);
pp->enable = top->active;
}
// A non-conditional directive inside an inactive group: skip the whole line
int is_conditional = 0;
switch (type) {
case SCC_PP_TOK_IFDEF:
case SCC_PP_TOK_IFNDEF:
case SCC_PP_TOK_ELIFDEF:
case SCC_PP_TOK_ELIFNDEF:
case SCC_PP_TOK_ELSE:
case SCC_PP_TOK_ENDIF:
case SCC_PP_TOK_IF:
case SCC_PP_TOK_ELIF:
is_conditional = 1;
break;
default:
break;
}
if (!is_conditional && !pp->enable) {
scc_lexer_skip_until_newline(pp->cur_ring);
return;
}
switch (type) {
case SCC_PP_TOK_DEFINE: {
scc_lexer_tok_drop(&tok);
scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (tok.type != SCC_TOK_IDENT) {
scc_lexer_tok_drop(&tok);
LOG_ERROR("expected identifier");
goto ERROR;
}
scc_lexer_tok_t next_tok;
scc_ring_peek(*pp->cur_ring, next_tok, ok);
if (!ok) {
LOG_ERROR("unexpected EOF");
goto ERROR;
}
if (next_tok.type == SCC_TOK_L_PAREN) {
// function macro
scc_pproc_parse_function_macro(pp, &tok);
} else {
// object macro
scc_pproc_parse_object_macro(pp, &tok);
}
scc_lexer_tok_drop(&tok);
// FIXME
return;
}
case SCC_PP_TOK_UNDEF: {
scc_lexer_tok_drop(&tok);
scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (tok.type != SCC_TOK_IDENT) {
scc_lexer_tok_drop(&tok);
LOG_ERROR("expected identifier");
goto ERROR;
}
scc_pproc_macro_table_remove(&pp->macro_table, &tok.lexeme);
scc_lexer_tok_drop(&tok);
scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (tok.type != SCC_TOK_ENDLINE) {
scc_lexer_tok_drop(&tok);
LOG_ERROR("expected newline");
goto ERROR;
}
scc_lexer_tok_drop(&tok);
return;
}
case SCC_PP_TOK_INCLUDE: {
scc_lexer_tok_ring_t out_ring;
scc_pproc_parse_line_and_expand(pp, &out_ring);
scc_pproc_parse_include(pp, &tok, &out_ring);
return;
}
case SCC_PP_TOK_IFDEF:
case SCC_PP_TOK_IFNDEF:
case SCC_PP_TOK_ELIFDEF:
case SCC_PP_TOK_ELIFNDEF: {
scc_lexer_tok_drop(&tok);
if (!scc_lexer_next_non_blank(pp->cur_ring, &tok) ||
tok.type != SCC_TOK_IDENT) {
scc_lexer_tok_drop(&tok);
LOG_ERROR("expected identifier");
} else {
scc_pproc_parse_if_defined(pp, type, &tok);
scc_lexer_tok_drop(&tok);
}
scc_lexer_skip_until_newline(pp->cur_ring);
return;
}
case SCC_PP_TOK_ELSE:
case SCC_PP_TOK_ENDIF: {
scc_lexer_tok_drop(&tok);
scc_pproc_parse_if_defined(pp, type, null);
scc_lexer_skip_until_newline(pp->cur_ring);
return;
}
case SCC_PP_TOK_IF:
case SCC_PP_TOK_ELIF: {
scc_lexer_tok_drop(&tok);
scc_lexer_tok_ring_t out_ring;
scc_pproc_parse_line_and_expand(pp, &out_ring);
scc_pproc_parse_if_condition(pp, type, &out_ring);
return;
}
case SCC_PP_TOK_LINE:
case SCC_PP_TOK_EMBED:
goto ERROR;
case SCC_PP_TOK_ERROR:
scc_lexer_tok_drop(&tok);
while (1) {
ok = scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (tok.type == SCC_TOK_ENDLINE || ok == false) {
return;
}
if (scc_get_tok_subtype(tok.type) == SCC_TOK_SUBTYPE_LITERAL) {
LOG_ERROR("%s", scc_cstring_as_cstr(&tok.lexeme));
}
scc_lexer_tok_drop(&tok);
}
return;
case SCC_PP_TOK_WARNING:
scc_lexer_tok_drop(&tok);
while (1) {
ok = scc_lexer_next_non_blank(pp->cur_ring, &tok);
if (tok.type == SCC_TOK_ENDLINE || ok == false) {
return;
}
if (scc_get_tok_subtype(tok.type) == SCC_TOK_SUBTYPE_LITERAL) {
LOG_WARN("%s", scc_cstring_as_cstr(&tok.lexeme));
}
scc_lexer_tok_drop(&tok);
}
return;
case SCC_PP_TOK_PRAGMA:
LOG_WARN("Pragma ignored");
scc_lexer_skip_until_newline(pp->cur_ring);
return;
default:
break;
}
ERROR:
LOG_WARN("Unhandled directive: %s", scc_cstring_as_cstr(&tok.lexeme));
scc_lexer_skip_until_newline(pp->cur_ring);
}


@@ -0,0 +1,509 @@
#include <pproc_expand.h>
#include <scc_pproc.h>
static scc_lexer_tok_t stringify_argument(scc_lexer_tok_vec_t *arg_tokens) {
// Written by AI
scc_cstring_t str = scc_cstring_create();
scc_cstring_append_ch(&str, '\"'); // opening quote
int need_space = 0; // whether a space must be inserted
scc_vec_foreach(*arg_tokens, i) {
scc_lexer_tok_t *tok = &scc_vec_at(*arg_tokens, i);
if (tok->type == SCC_TOK_BLANK) {
need_space = 1; // remember that whitespace was seen
continue;
}
// A space is pending and this is not the first token: insert one space
if (need_space && i > 0) {
scc_cstring_append_ch(&str, ' ');
}
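// Escape " and \ inside string/character constants
if (tok->type == SCC_TOK_STRING_LITERAL ||
tok->type == SCC_TOK_CHAR_LITERAL) {
// Note: the lexeme includes the surrounding quotes; skip them and escape the inner characters
// Simplification: inner escapes are not handled yet; the lexeme is appended as-is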
}
scc_cstring_append(&str, &tok->lexeme);
need_space = 0;
}
scc_cstring_append_ch(&str, '\"'); // closing quote
scc_lexer_tok_t result = {0}; // zero the fields that are not set below
result.type = SCC_TOK_STRING_LITERAL;
result.lexeme = str;
return result;
}
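Review note: stringify_argument currently appends string and character literals unescaped. The C standard's rule for the # operator inserts a backslash before each `"` and `\` of a string literal or character constant, including the delimiting quotes. Below is a minimal sketch of that escaping step on plain C strings; `demo_append_spelling` and its buffer-based signature are hypothetical and do not use the scc_cstring API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append the spelling of one token to out (current length len, capacity
 * cap), escaping '"' and '\\' when the token is a string or character
 * literal. Returns the new length, or (size_t)-1 if out is too small.
 */
size_t demo_append_spelling(char *out, size_t len, size_t cap,
                            const char *spelling, int is_literal) {
    assert(out != NULL && spelling != NULL && len < cap);
    for (const char *p = spelling; *p != '\0'; p++) {
        int needs_escape = is_literal && (*p == '"' || *p == '\\');
        size_t need = needs_escape ? 2 : 1;
        if (len + need >= cap) /* keep room for the terminating NUL */
            return (size_t)-1;
        if (needs_escape)
            out[len++] = '\\';
        out[len++] = *p;
    }
    out[len] = '\0';
    return len;
}
```

With this rule, stringifying the literal `"a\n"` yields the spelling `\"a\\n\"` between the outer quotes, so the resulting string literal prints the original source text.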
static scc_lexer_tok_t concatenate_tokens(const scc_lexer_tok_t *left,
const scc_lexer_tok_t *right) {
Assert(left != null && right != null);
scc_cstring_t new_lex = scc_cstring_create();
scc_cstring_append(&new_lex, &left->lexeme);
scc_cstring_append(&new_lex, &right->lexeme);
scc_lexer_t lexer;
scc_sstream_t sstream;
// ownership of new_lex is transferred to the stream
scc_sstream_init_by_buffer(&sstream, scc_cstring_as_cstr(&new_lex),
scc_cstring_len(&new_lex), true, 8);
scc_lexer_init(&lexer, scc_sstream_to_ring(&sstream));
scc_lexer_tok_ring_t *ring = scc_lexer_to_ring(&lexer, 8, true);
scc_lexer_tok_t result;
int ok;
scc_ring_next_consume(*ring, result, ok);
if (!ok) {
scc_lexer_tok_drop(&result);
return result;
}
scc_ring_next_consume(*ring, result, ok);
if (ok) {
scc_lexer_tok_drop(&result);
return result;
}
scc_lexer_drop_ring(ring);
scc_lexer_drop(&lexer);
scc_sstream_drop(&sstream);
return result;
}
static inline void scc_copy_expand(scc_pproc_expand_t *expand_ctx,
scc_pproc_expand_t *copyed_ctx,
scc_lexer_tok_ring_t *ring) {
copyed_ctx->input = ring;
copyed_ctx->expanded_set = expand_ctx->expanded_set;
copyed_ctx->macro_table = expand_ctx->macro_table;
copyed_ctx->need_rescan = false;
scc_vec_init(copyed_ctx->output);
}
void scc_pproc_expand_by_src(scc_pproc_macro_table_t *macro_table,
scc_lexer_tok_ring_t *input,
scc_lexer_tok_ring_t *output,
const scc_pproc_macro_t *macro) {
scc_lexer_tok_vec_t expaned_buffer;
scc_vec_init(expaned_buffer);
int ok;
scc_lexer_tok_t tok;
scc_ring_next_consume(*input, tok, ok);
if (macro->type == SCC_PP_MACRO_NONE || ok == false) {
UNREACHABLE();
} else if (macro->type == SCC_PP_MACRO_OBJECT) {
scc_vec_push(expaned_buffer, tok);
} else if (macro->type == SCC_PP_MACRO_FUNCTION) {
scc_vec_push(expaned_buffer, tok);
scc_pproc_parse_macro_arguments(input, &expaned_buffer, true);
}
scc_pproc_expand_by_vec(macro_table, &expaned_buffer, output);
}
void scc_pproc_expand_by_vec(scc_pproc_macro_table_t *macro_table,
scc_lexer_tok_vec_t *input,
scc_lexer_tok_ring_t *output) {
scc_pproc_expand_t ctx;
scc_lexer_tok_ring_t ring = scc_lexer_array_to_ring(input);
ctx.input = &ring;
ctx.macro_table = macro_table;
ctx.need_rescan = false;
scc_vec_init(ctx.output);
scc_pproc_macro_table_t expanded_set;
ctx.expanded_set = &expanded_set;
scc_pproc_marco_table_init(ctx.expanded_set);
scc_pproc_expand_macro(&ctx);
*output = scc_lexer_array_to_ring(&ctx.output);
scc_pproc_macro_table_drop(ctx.expanded_set);
}
static inline void
split_arguments(scc_pproc_macro_extened_params_t *splited_params,
scc_lexer_tok_vec_t *raw_args, const scc_pproc_macro_t *macro) {
scc_lexer_tok_vec_t arg;
scc_vec_init(arg);
int named_count = (int)scc_vec_size(macro->params);
cbool is_variadic =
(named_count > 0 &&
scc_vec_at(macro->params, named_count - 1).type == SCC_TOK_ELLIPSIS);
int depth = 0;
scc_vec_foreach(*raw_args, i) {
scc_lexer_tok_t *raw_arg = &scc_vec_at(*raw_args, i);
if (raw_arg->type == SCC_TOK_L_PAREN) {
depth++;
} else if (raw_arg->type == SCC_TOK_R_PAREN) {
depth--;
}
if (depth != 0 || raw_arg->type != SCC_TOK_COMMA ||
(is_variadic && scc_vec_size(*splited_params) == named_count - 1)) {
if (scc_vec_size(arg) == 0 && raw_arg->type == SCC_TOK_BLANK) {
scc_lexer_tok_drop(raw_arg);
} else {
scc_vec_push(arg, *raw_arg);
}
continue;
} else {
scc_lexer_tok_drop(raw_arg);
if (scc_vec_size(arg) &&
scc_vec_at(arg, scc_vec_size(arg) - 1).type == SCC_TOK_BLANK) {
scc_lexer_tok_drop(&scc_vec_pop(arg));
}
scc_vec_push(*splited_params, arg);
scc_vec_init(arg);
}
}
if (scc_vec_size(arg) &&
scc_vec_at(arg, scc_vec_size(arg) - 1).type == SCC_TOK_BLANK) {
scc_lexer_tok_drop(&scc_vec_pop(arg));
}
scc_vec_push(*splited_params, arg);
if (is_variadic && scc_vec_size(*splited_params) == named_count - 1) {
scc_vec_init(arg);
scc_vec_push(*splited_params, arg);
}
}
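Review note: split_arguments above separates the raw argument tokens at commas only while the parenthesis depth is zero. The following is a minimal sketch of that rule on plain characters rather than tokens. `demo_split_args` and its signature are hypothetical, and the variadic case (where the remaining commas stay inside `__VA_ARGS__`) is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split text such as "a, (b, c), d" on commas at parenthesis depth zero.
 * Stores the start offset of each argument in starts (at most max entries)
 * and returns the total number of arguments found.
 */
size_t demo_split_args(const char *text, size_t *starts, size_t max) {
    assert(text != NULL && starts != NULL && max > 0);
    size_t count = 0;
    int depth = 0;
    starts[count++] = 0; /* the first argument always begins at offset 0 */
    for (size_t i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        if (c == '(') {
            depth++;
        } else if (c == ')') {
            depth--;
        } else if (c == ',' && depth == 0) {
            if (count < max)
                starts[count] = i + 1;
            count++;
        }
    }
    return count;
}
```

As in split_arguments, an empty input still counts as one (empty) argument, and a comma nested inside parentheses never starts a new argument.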
static inline void
expand_arguments(scc_pproc_macro_extened_params_t *expanded_params,
scc_pproc_macro_extened_params_t *splited_params,
scc_pproc_expand_t *expand_ctx) {
scc_vec_foreach(*splited_params, i) {
scc_pproc_expand_t ctx;
scc_lexer_tok_vec_t splite_param = scc_vec_at(*splited_params, i);
scc_lexer_tok_vec_t expanded_param;
scc_vec_init(expanded_param);
scc_vec_foreach(splite_param, j) {
scc_lexer_tok_t tok = scc_vec_at(splite_param, j);
tok.lexeme = scc_cstring_copy(&tok.lexeme);
scc_vec_push(expanded_param, tok);
}
scc_lexer_tok_ring_t ring = scc_lexer_array_to_ring(&expanded_param);
scc_copy_expand(expand_ctx, &ctx, &ring);
scc_pproc_expand_macro(&ctx);
scc_ring_free(ring);
scc_vec_push(*expanded_params, ctx.output);
}
}
static inline void
expanded_params_free(scc_pproc_macro_extened_params_t *expanded_params) {
scc_vec_foreach(*expanded_params, i) {
scc_lexer_tok_vec_t expanded_param = scc_vec_at(*expanded_params, i);
scc_vec_foreach(expanded_param, j) {
scc_lexer_tok_t tok = scc_vec_at(expanded_param, j);
scc_lexer_tok_drop(&tok);
}
scc_vec_free(expanded_param);
}
scc_vec_free(*expanded_params);
}
static void rescan(scc_pproc_expand_t *expand_ctx,
const scc_pproc_macro_t *macro,
scc_lexer_tok_vec_t *tok_buffer) {
scc_pproc_macro_t *expanded_macro =
scc_pproc_macro_new(&macro->name, macro->type);
if (expanded_macro == null) {
LOG_FATAL("Out of memory");
}
scc_pproc_macro_table_set(expand_ctx->expanded_set, expanded_macro);
scc_pproc_expand_t rescan_ctx;
scc_lexer_tok_ring_t ring = scc_lexer_array_to_ring(tok_buffer);
scc_copy_expand(expand_ctx, &rescan_ctx, &ring);
scc_pproc_expand_macro(&rescan_ctx);
scc_ring_free(ring);
scc_vec_foreach(rescan_ctx.output, i) {
scc_vec_push(expand_ctx->output, scc_vec_at(rescan_ctx.output, i));
}
scc_pproc_macro_table_remove(expand_ctx->expanded_set, &macro->name);
}
static int find_params(const scc_lexer_tok_t *tok,
const scc_pproc_macro_t *macro) {
scc_vec_foreach(macro->params, j) {
if (scc_cstring_cmp(&(tok->lexeme),
&(scc_vec_at(macro->params, j).lexeme)) == 0) {
return j;
}
}
return -1;
}
static inline int got_left_non_blank(int i,
const scc_lexer_tok_vec_t *replaces) {
int left_idx = i - 1;
while (left_idx >= 0 &&
scc_vec_at(*replaces, left_idx).type == SCC_TOK_BLANK) {
left_idx--;
}
return left_idx;
}
static inline int got_right_non_blank(int i,
const scc_lexer_tok_vec_t *replaces) {
int right_idx = i + 1;
while (right_idx < (int)scc_vec_size(*replaces) &&
scc_vec_at(*replaces, right_idx).type == SCC_TOK_BLANK) {
right_idx++;
}
return right_idx;
}
static inline void expand_function_macro(scc_pproc_expand_t *expand_ctx,
const scc_pproc_macro_t *macro) {
scc_lexer_tok_vec_t tok_buffer;
scc_vec_init(tok_buffer);
Assert(macro->type == SCC_PP_MACRO_FUNCTION);
scc_lexer_tok_vec_t raw_args;
scc_vec_init(raw_args);
scc_pproc_parse_macro_arguments(expand_ctx->input, &raw_args, false);
// collect, fill and expand arg
scc_pproc_macro_extened_params_t splited_params;
scc_vec_init(splited_params);
split_arguments(&splited_params, &raw_args, macro);
scc_pproc_macro_extened_params_t expanded_params;
scc_vec_init(expanded_params);
expand_arguments(&expanded_params, &splited_params, expand_ctx);
// replace
scc_vec_foreach(macro->replaces, i) {
scc_lexer_tok_t tok =
scc_lexer_tok_copy(&scc_vec_at(macro->replaces, i));
if (tok.type == SCC_TOK_BLANK) {
scc_cstring_free(&tok.lexeme);
tok.lexeme = scc_cstring_from_cstr(" ");
scc_vec_push(tok_buffer, tok);
continue;
}
if (tok.type == SCC_TOK_SHARP) {
// # stringify
scc_lexer_tok_drop(&tok);
int right_idx = got_right_non_blank(i, &macro->replaces);
if (right_idx >= (int)macro->replaces.size) {
LOG_WARN("generate empty stringify");
scc_cstring_free(&tok.lexeme);
tok.lexeme = scc_cstring_from_cstr("");
scc_vec_push(tok_buffer, tok);
break;
}
int j = find_params(&scc_vec_at(macro->replaces, right_idx), macro);
Assert(j != -1 && j < (int)scc_vec_size(splited_params));
tok = stringify_argument(&scc_vec_at(splited_params, j));
scc_vec_push(tok_buffer, tok);
i = right_idx;
continue;
} else if (tok.type == SCC_TOK_SHARP_SHARP) {
// ## concatenate
scc_lexer_tok_drop(&tok);
int left_idx = got_left_non_blank(i, &macro->replaces);
int right_idx = got_right_non_blank(i, &macro->replaces);
if (left_idx < 0 || right_idx >= (int)macro->replaces.size) {
LOG_FATAL("Invalid ## operator");
}
while (i++ < right_idx) {
scc_lexer_tok_drop(&scc_vec_pop(tok_buffer));
}
scc_lexer_tok_t *left_tok = &scc_vec_at(macro->replaces, left_idx);
scc_lexer_tok_t *right_tok =
&scc_vec_at(macro->replaces, right_idx);
if (left_tok->type == SCC_TOK_COMMA &&
scc_strcmp(scc_cstring_as_cstr(&(right_tok->lexeme)),
"__VA_ARGS__") == 0) {
// GNU extension: delete the comma before an empty __VA_ARGS__
int right_param_idx = find_params(right_tok, macro);
Assert(right_param_idx != -1);
scc_lexer_tok_vec_t right_vec =
scc_vec_at(expanded_params, right_param_idx);
if (scc_vec_size(right_vec) != 0) {
// Variadic arguments are non-empty: emit a copy of the comma, then the expansion of the right-hand parameter
scc_lexer_tok_t comma_tok = scc_lexer_tok_copy(left_tok);
scc_vec_push(tok_buffer, comma_tok);
}
scc_vec_foreach(right_vec, k) {
scc_lexer_tok_t tok =
scc_lexer_tok_copy(&scc_vec_at(right_vec, k));
scc_vec_push(tok_buffer, tok);
}
i = right_idx;
continue;
}
int idx;
idx = find_params(left_tok, macro);
scc_lexer_tok_vec_t left_vec;
if (idx != -1) {
Assert(idx < (int)scc_vec_size(splited_params));
left_vec = scc_vec_at(splited_params, idx);
} else {
scc_vec_init(left_vec);
scc_vec_push(left_vec, scc_lexer_tok_copy(left_tok));
}
idx = find_params(right_tok, macro);
scc_lexer_tok_vec_t right_vec;
if (idx != -1) {
Assert(idx < (int)scc_vec_size(splited_params));
right_vec = scc_vec_at(splited_params, idx);
} else {
scc_vec_init(right_vec);
scc_vec_push(right_vec, scc_lexer_tok_copy(right_tok));
}
scc_lexer_tok_t *left =
scc_vec_size(left_vec)
? &scc_vec_at(left_vec, scc_vec_size(left_vec) - 1)
: null;
scc_lexer_tok_t *right =
scc_vec_size(right_vec) ? &scc_vec_at(right_vec, 0) : null;
scc_vec_foreach(left_vec, k) {
if (k + 1 >= scc_vec_size(left_vec)) {
continue;
}
scc_lexer_tok_t tok =
scc_lexer_tok_copy(&scc_vec_at(left_vec, k));
scc_vec_push(tok_buffer, tok);
}
scc_lexer_tok_t concate_tok = concatenate_tokens(left, right);
if (concate_tok.type == SCC_TOK_UNKNOWN) {
LOG_FATAL("Invalid ## token");
}
scc_vec_push(tok_buffer, concate_tok);
scc_vec_foreach(right_vec, k) {
if (k == 0) {
continue;
}
scc_lexer_tok_t tok =
scc_lexer_tok_copy(&scc_vec_at(right_vec, k));
scc_vec_push(tok_buffer, tok);
}
i = right_idx;
continue;
} else {
int j = find_params(&tok, macro);
if (j != -1) {
Assert(j < (int)scc_vec_size(expanded_params));
scc_lexer_tok_vec_t expanded_param =
scc_vec_at(expanded_params, j);
scc_lexer_tok_drop(&tok);
scc_vec_foreach(expanded_param, k) {
tok = scc_lexer_tok_copy(&scc_vec_at(expanded_param, k));
scc_vec_push(tok_buffer, tok);
}
continue;
}
}
scc_vec_push(tok_buffer, tok);
}
expanded_params_free(&splited_params);
expanded_params_free(&expanded_params);
rescan(expand_ctx, macro, &tok_buffer);
}
static inline void expand_object_macro(scc_pproc_expand_t *expand_ctx,
const scc_pproc_macro_t *macro) {
scc_lexer_tok_vec_t tok_buffer;
scc_vec_init(tok_buffer);
scc_vec_foreach(macro->replaces, i) {
scc_lexer_tok_t tok =
scc_lexer_tok_copy(&scc_vec_at(macro->replaces, i));
if (tok.type == SCC_TOK_BLANK) {
scc_cstring_free(&tok.lexeme);
tok.lexeme = scc_cstring_from_cstr(" ");
} else if (tok.type == SCC_TOK_SHARP_SHARP) {
// ## concatenate
int left_idx = got_left_non_blank(i, &macro->replaces);
int right_idx = got_right_non_blank(i, &macro->replaces);
if (left_idx < 0 ||
right_idx >= (int)scc_vec_size(macro->replaces)) {
LOG_FATAL("Invalid ## operator");
}
scc_lexer_tok_t *left = &scc_vec_at(macro->replaces, left_idx);
scc_lexer_tok_t *right = &scc_vec_at(macro->replaces, right_idx);
scc_lexer_tok_t concate_tok = concatenate_tokens(left, right);
while (i++ < right_idx) {
scc_lexer_tok_drop(&scc_vec_pop(tok_buffer));
}
if (concate_tok.type == SCC_TOK_UNKNOWN) {
LOG_FATAL("Invalid ## token");
}
scc_vec_push(tok_buffer, concate_tok);
i = right_idx;
continue;
}
scc_vec_push(tok_buffer, tok);
}
rescan(expand_ctx, macro, &tok_buffer);
}
void scc_pproc_expand_macro(scc_pproc_expand_t *expand_ctx) {
int ok;
scc_lexer_tok_t tok;
while (1) {
scc_ring_next_consume(*expand_ctx->input, tok, ok);
if (!ok) {
return;
}
if (tok.type != SCC_TOK_IDENT) {
scc_vec_push(expand_ctx->output, tok);
continue;
}
// maybe expanded
scc_pproc_macro_t *macro =
scc_pproc_macro_table_get(expand_ctx->macro_table, &tok.lexeme);
if (macro == null ||
scc_pproc_macro_table_get(expand_ctx->expanded_set, &macro->name)) {
scc_vec_push(expand_ctx->output, tok);
continue;
}
expand_ctx->need_rescan = true;
if (macro->type == SCC_PP_MACRO_OBJECT) {
expand_object_macro(expand_ctx, macro);
} else if (macro->type == SCC_PP_MACRO_FUNCTION) {
scc_lexer_tok_t expect_tok;
scc_ring_peek(*expand_ctx->input, expect_tok, ok);
if (ok == false || expect_tok.type != SCC_TOK_L_PAREN) {
scc_vec_push(expand_ctx->output, tok);
continue;
}
expand_function_macro(expand_ctx, macro);
} else {
UNREACHABLE();
}
}
if (expand_ctx->need_rescan) {
expand_ctx->need_rescan = false;
scc_pproc_expand_t rescan_ctx;
scc_lexer_tok_ring_t ring =
scc_lexer_array_to_ring(&expand_ctx->output);
scc_copy_expand(expand_ctx, &rescan_ctx, &ring);
scc_pproc_expand_macro(&rescan_ctx);
scc_ring_free(ring);
expand_ctx->output = rescan_ctx.output;
}
}

libs/pproc/src/pproc_if.c Normal file

@@ -0,0 +1,173 @@
#include <scc_lexer_utils.h>
#include <scc_pproc.h>
cbool scc_pproc_parse_if_defined(scc_pproc_t *pp, scc_tok_type_t type,
const scc_lexer_tok_t *tok) {
int defined = 0;
if (tok) {
defined = (scc_pproc_macro_table_get(&pp->macro_table,
&(tok->lexeme)) != null);
}
switch (type) {
case SCC_PP_TOK_IFDEF:
case SCC_PP_TOK_IFNDEF: {
cbool condition = (type == SCC_PP_TOK_IFDEF) ? defined : !defined;
scc_pproc_if_t new_if;
new_if.found_true = condition;
new_if.seen_else = 0;
new_if.active = pp->enable ? condition : 0;
scc_vec_push(pp->if_stack, new_if);
pp->enable = new_if.active;
break;
}
case SCC_PP_TOK_ELIFDEF:
case SCC_PP_TOK_ELIFNDEF: {
if (scc_vec_size(pp->if_stack) == 0) {
LOG_ERROR("#elif without #if");
return false;
}
scc_pproc_if_t *top =
&scc_vec_at(pp->if_stack, scc_vec_size(pp->if_stack) - 1);
if (top->seen_else) {
LOG_ERROR("#elifdef or #elifndef after #else");
return false;
}
int condition = (type == SCC_PP_TOK_ELIFDEF) ? defined : !defined;
if (top->found_true) {
// an earlier branch was already taken, so this elif stays inactive
top->active = 0;
} else {
if (condition) {
top->active = 1;
top->found_true = 1;
} else {
top->active = 0;
}
}
// seen_else remains 0
pp->enable = top->active;
break;
}
case SCC_PP_TOK_ELSE: {
if (scc_vec_size(pp->if_stack) == 0) {
LOG_ERROR("#else without #if");
return false;
}
scc_pproc_if_t *top =
&scc_vec_at(pp->if_stack, scc_vec_size(pp->if_stack) - 1);
if (top->seen_else) {
LOG_ERROR("multiple #else");
return false;
}
if (top->found_true) {
top->active = 0;
} else {
top->active = 1;
top->found_true = 1;
}
top->seen_else = 1;
pp->enable = top->active;
break;
}
case SCC_PP_TOK_ENDIF: {
if (scc_vec_size(pp->if_stack) == 0) {
LOG_ERROR("#endif without #if");
} else {
scc_vec_pop(pp->if_stack);
}
if (scc_vec_size(pp->if_stack) == 0) {
pp->enable = 1;
} else {
pp->enable =
scc_vec_at(pp->if_stack, scc_vec_size(pp->if_stack) - 1).active;
}
break;
}
default: {
LOG_FATAL("unexpected directive");
}
}
return true;
}
static cbool parse_constant_condition() { return false; }
cbool scc_pproc_parse_if_condition(scc_pproc_t *pp, scc_tok_type_t type,
scc_lexer_tok_ring_t *tok_ring) {
// TODO: evaluate full constant expressions; only a single integer literal is
// handled for now
int ok;
scc_lexer_tok_t tok = {0};
ok = scc_lexer_next_non_blank(tok_ring, &tok);
if (ok == false) {
LOG_FATAL("unexpected EOF");
}
int condition = 0;
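if (tok.type == SCC_TOK_INT_LITERAL) {
// FIXME: only the first character is inspected, so literals such as 0x1 or 01
// are treated as false
condition = scc_cstring_as_cstr(&tok.lexeme)[0] == '0' ? 0 : 1;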
} else {
LOG_ERROR("expected integer constant but got %s",
scc_cstring_as_cstr(&tok.lexeme));
}
scc_lexer_tok_drop(&tok);
ok = scc_lexer_next_non_blank(tok_ring, &tok);
if (ok == false) {
LOG_FATAL("unexpected EOF");
}
if (tok.type != SCC_TOK_ENDLINE) {
LOG_ERROR("expected endline");
scc_lexer_skip_until_newline(tok_ring);
} else {
scc_lexer_tok_drop(&tok);
}
scc_ring_free(*tok_ring);
// update the conditional-compilation stack according to the directive type
switch (type) {
case SCC_PP_TOK_IF: {
scc_pproc_if_t new_if;
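// inside a disabled region, mark the group as already taken so that a later
// #elif/#else of this group cannot re-enable output
new_if.found_true = pp->enable ? condition : 1;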
new_if.seen_else = 0;
new_if.active = pp->enable ? condition : 0;
scc_vec_push(pp->if_stack, new_if);
pp->enable = new_if.active;
break;
}
case SCC_PP_TOK_ELIF: {
if (scc_vec_size(pp->if_stack) == 0) {
LOG_ERROR("#elif without #if");
return false;
}
scc_pproc_if_t *top =
&scc_vec_at(pp->if_stack, scc_vec_size(pp->if_stack) - 1);
if (top->seen_else) {
LOG_ERROR("#elif after #else");
return false;
}
if (top->found_true) {
top->active = 0;
} else {
if (condition) {
top->active = 1;
top->found_true = 1;
} else {
top->active = 0;
}
}
pp->enable = top->active;
break;
}
default:
LOG_FATAL("unexpected directive in parse_if_condition");
}
return true;
}
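
The conditional stack handled in this file can be sketched in isolation. The following is a minimal, self-contained model, not the scc implementation: `cond_frame_t`, `cond_state_t`, `COND_MAX_DEPTH` and the `cond_*` functions are hypothetical names, and a fixed-size array stands in for `pp->if_stack`. The field names mirror `found_true`, `seen_else` and `active`. As a design choice of the sketch, a group opened inside a disabled region is marked as already taken, so none of its branches can switch output back on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the conditional stack. */
typedef struct {
    bool found_true; /* a branch of this group has already been taken */
    bool seen_else;  /* #else was already seen for this group */
    bool active;     /* the current branch emits tokens */
} cond_frame_t;

#define COND_MAX_DEPTH 32

typedef struct {
    cond_frame_t frames[COND_MAX_DEPTH];
    size_t depth;
    bool enable; /* plays the role of pp->enable */
} cond_state_t;

static void cond_init(cond_state_t *s) {
    s->depth = 0;
    s->enable = true;
}

/* #if, #ifdef, #ifndef: push a new group. */
static bool cond_if(cond_state_t *s, bool condition) {
    if (s->depth == COND_MAX_DEPTH)
        return false;
    cond_frame_t *f = &s->frames[s->depth++];
    /* In a disabled region, pretend a branch was taken already. */
    f->found_true = s->enable ? condition : true;
    f->seen_else = false;
    f->active = s->enable && condition;
    s->enable = f->active;
    return true;
}

/* #elif and friends: active only if no earlier branch was taken. */
static bool cond_elif(cond_state_t *s, bool condition) {
    if (s->depth == 0)
        return false; /* #elif without #if */
    cond_frame_t *top = &s->frames[s->depth - 1];
    if (top->seen_else)
        return false; /* #elif after #else */
    top->active = !top->found_true && condition;
    if (top->active)
        top->found_true = true;
    s->enable = top->active;
    return true;
}

/* #else: active only if no earlier branch was taken. */
static bool cond_else(cond_state_t *s) {
    if (s->depth == 0)
        return false; /* #else without #if */
    cond_frame_t *top = &s->frames[s->depth - 1];
    if (top->seen_else)
        return false; /* multiple #else */
    top->active = !top->found_true;
    top->found_true = true;
    top->seen_else = true;
    s->enable = top->active;
    return true;
}

/* #endif: pop the group and restore the enclosing group's state. */
static bool cond_endif(cond_state_t *s) {
    if (s->depth == 0)
        return false; /* #endif without #if */
    s->depth--;
    s->enable = (s->depth == 0) ? true : s->frames[s->depth - 1].active;
    return true;
}
```

Feeding the sketch `#if 0`, `#elif 1`, `#else`, `#endif` enables output only for the `#elif` branch, which matches the `"one\n"` expectation in the unit tests.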


@@ -0,0 +1,113 @@
#include <pproc_expand.h>
#include <scc_core_ring.h>
#include <scc_pproc.h>
static int switch_file_stack(scc_pproc_t *pp, scc_cstring_t *fname,
scc_pos_t *pos, int is_system) {
scc_cstring_t fpath = scc_cstring_create();
int ret = 0;
const char *org_fname = pos->name;
if (!is_system) {
const char parent[] = "/../";
// FIXME: this path construction could probably be simplified
scc_cstring_append_cstr(&fpath, org_fname, scc_strlen(org_fname));
scc_cstring_append_cstr(&fpath, parent, scc_strlen(parent));
scc_cstring_append(&fpath, fname);
ret = scc_fexists(scc_cstring_as_cstr(&fpath));
if (ret == true) {
goto FOPEN;
}
}
/* system default path and -I includes path */
scc_vec_foreach(pp->include_paths, i) {
scc_cstring_free(&fpath);
scc_cstring_t *syspath = &scc_vec_at(pp->include_paths, i);
scc_cstring_append(&fpath, syspath);
scc_cstring_append_ch(&fpath, '/');
scc_cstring_append(&fpath, fname);
ret = scc_fexists(scc_cstring_as_cstr(&fpath));
if (ret == true) {
goto FOPEN;
}
}
LOG_ERROR("%s:%d:%d: include file %c%s%c not found", org_fname,
pos->line, pos->col, is_system ? '<' : '\"',
scc_cstring_as_cstr(fname), is_system ? '>' : '\"');
return -1;
FOPEN:
if (scc_vec_size(pp->file_stack) >= pp->config.max_include_depth) {
LOG_FATAL("Include nesting is too deep: current depth is %d, "
"max_include_depth is %d",
scc_vec_size(pp->file_stack), pp->config.max_include_depth);
}
scc_pproc_file_t *file = scc_malloc(sizeof(scc_pproc_file_t));
Assert(file != null);
if (scc_sstream_init(&(file->sstream), scc_cstring_as_cstr(&fpath), 1024)) {
return -1;
}
scc_lexer_init(&(file->lexer), scc_sstream_to_ring(&(file->sstream)));
file->ring = scc_lexer_to_ring(&(file->lexer), 8, true);
pp->cur_ring = file->ring;
scc_vec_push(pp->file_stack, file);
return 0;
}
void scc_pproc_parse_include(scc_pproc_t *pp, scc_lexer_tok_t *include_tok,
scc_lexer_tok_ring_t *tok_ring) {
int ok;
scc_lexer_tok_t tok;
scc_pos_t pos = include_tok->loc;
scc_lexer_tok_drop(include_tok);
scc_cstring_t line = scc_cstring_create();
while (1) {
scc_ring_next_consume(*tok_ring, tok, ok);
if (!ok)
break;
if (scc_get_tok_subtype(tok.type) != SCC_TOK_SUBTYPE_EMPTYSPACE &&
scc_get_tok_subtype(tok.type) != SCC_TOK_SUBTYPE_COMMENT) {
scc_cstring_append(&line, &tok.lexeme);
}
scc_lexer_tok_drop(&tok);
}
scc_ring_free(*tok_ring);
const char *includename = scc_cstring_as_cstr(&line);
int len = scc_cstring_len(&line);
if (len <= 2) {
// too short to hold both delimiters and a non-empty name
goto ERROR;
} else {
if (includename[0] == '\"') {
if (includename[len - 1] != '\"') {
goto ERROR;
}
} else if (includename[0] == '<') {
if (includename[len - 1] != '>') {
goto ERROR;
}
} else {
goto ERROR;
}
}
scc_cstring_t fname = scc_cstring_create();
for (int i = 1; i < len - 1; i++) {
scc_cstring_append_ch(&fname, includename[i]);
}
// read the delimiter before freeing `line`, which owns the storage that
// includename points to
int is_system = includename[0] == '<';
scc_cstring_free(&line);
if (switch_file_stack(pp, &fname, &pos, is_system)) {
// TODO: propagate the failure; switch_file_stack has already logged it
}
scc_cstring_free(&fname);
return;
ERROR:
LOG_ERROR("Invalid include filename: expected \"FILENAME\" or <FILENAME>");
scc_cstring_free(&line);
}
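
The delimiter check in `scc_pproc_parse_include` can be illustrated on its own. This is a hedged sketch using plain C strings instead of `scc_cstring_t`; `parse_include_name` and its parameters are hypothetical and not part of scc. It accepts `"name"` or `<name>`, rejects empty or mismatched forms, and reports which form was used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits the spelled include operand into a file name
 * and a "system include" flag. Returns false for malformed input. */
static bool parse_include_name(const char *spelled, char *out, size_t out_cap,
                               bool *is_system) {
    size_t len = strlen(spelled);
    if (len <= 2) /* need both delimiters plus at least one character */
        return false;
    char open = spelled[0];
    char close = spelled[len - 1];
    if (open == '"' && close == '"') {
        *is_system = false;
    } else if (open == '<' && close == '>') {
        *is_system = true;
    } else {
        return false; /* missing or mismatched delimiters */
    }
    if (len - 2 >= out_cap) /* no room for the name plus the terminator */
        return false;
    memcpy(out, spelled + 1, len - 2);
    out[len - 2] = '\0';
    return true;
}
```

The flag is computed while the input is still alive and the name is copied out, so the caller can release the original buffer afterwards without touching freed memory.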


@@ -0,0 +1,157 @@
#include <pproc_macro.h>
// Create a macro object
scc_pproc_macro_t *scc_pproc_macro_new(const scc_cstring_t *name,
scc_pproc_macro_type_t type) {
scc_pproc_macro_t *macro = scc_malloc(sizeof(scc_pproc_macro_t));
if (!macro) {
LOG_ERROR("Failed to allocate memory for macro");
return null;
}
macro->name = scc_cstring_copy(name);
macro->type = type;
scc_vec_init(macro->params);
scc_vec_init(macro->replaces);
return macro;
}
// Destroy a macro object
void scc_pproc_macro_drop(scc_pproc_macro_t *macro) {
if (!macro)
return;
// free the parameter list
for (usize i = 0; i < macro->params.size; ++i) {
scc_lexer_tok_drop(&scc_vec_at(macro->params, i));
}
scc_vec_free(macro->params);
// free the replacement list
for (usize i = 0; i < macro->replaces.size; ++i) {
scc_lexer_tok_drop(&scc_vec_at(macro->replaces, i));
}
scc_vec_free(macro->replaces);
scc_cstring_free(&macro->name);
scc_free(macro);
}
// Add an object-like macro
cbool scc_pproc_add_object_macro(scc_pproc_macro_table_t *macros,
const scc_cstring_t *name,
const scc_lexer_tok_vec_t *replacement) {
if (!macros || !name || !replacement)
return false;
scc_pproc_macro_t *macro = scc_pproc_macro_new(name, SCC_PP_MACRO_OBJECT);
if (!macro)
return false;
macro->replaces = *replacement;
// check whether a macro with the same name already exists
scc_pproc_macro_t *existing =
scc_hashtable_get(&macros->table, &macro->name);
if (existing) {
LOG_WARN("Redefining macro: %s", scc_cstring_as_cstr(&macro->name));
scc_pproc_macro_drop(existing);
}
scc_hashtable_set(&macros->table, &macro->name, macro);
return true;
}
// Add a function-like macro
cbool scc_pproc_add_function_macro(scc_pproc_macro_table_t *macros,
const scc_cstring_t *name,
const scc_lexer_tok_vec_t *params,
const scc_lexer_tok_vec_t *replacement) {
if (!macros || !name || !params || !replacement)
return false;
scc_pproc_macro_t *macro = scc_pproc_macro_new(name, SCC_PP_MACRO_FUNCTION);
if (!macro)
return false;
// take over the parameter and replacement lists (shallow copy of the vectors)
macro->params = *params;
macro->replaces = *replacement;
// check whether a macro with the same name already exists
scc_pproc_macro_t *existing =
scc_hashtable_get(&macros->table, &macro->name);
if (existing) {
LOG_WARN("Redefining macro: %s", scc_cstring_as_cstr(&macro->name));
scc_pproc_macro_drop(existing);
}
scc_hashtable_set(&macros->table, &macro->name, macro);
return true;
}
/// macro_table
scc_pproc_macro_t *scc_pproc_macro_table_set(scc_pproc_macro_table_t *pp,
scc_pproc_macro_t *macro) {
Assert(pp != null && macro != null);
return scc_hashtable_set(&pp->table, &macro->name, macro);
}
// Look up a macro definition
scc_pproc_macro_t *scc_pproc_macro_table_get(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name) {
return scc_hashtable_get(&pp->table, name);
}
// Remove a macro from the preprocessor
cbool scc_pproc_macro_table_remove(scc_pproc_macro_table_t *pp,
const scc_cstring_t *name) {
if (!pp || !name)
return false;
scc_pproc_macro_t *macro = scc_hashtable_get(&pp->table, name);
if (!macro)
return false;
scc_hashtable_del(&pp->table, name);
scc_pproc_macro_drop(macro);
return true;
}
static u32 hash_func(const void *key) {
const scc_cstring_t *string = (const scc_cstring_t *)key;
return scc_strhash32(scc_cstring_as_cstr(string));
}
static int hash_cmp(const void *key1, const void *key2) {
const scc_cstring_t *str1 = (const scc_cstring_t *)key1;
const scc_cstring_t *str2 = (const scc_cstring_t *)key2;
if (str1->size != str2->size) {
return str1->size - str2->size;
}
return scc_strcmp(scc_cstring_as_cstr(str1), scc_cstring_as_cstr(str2));
}
void scc_pproc_marco_table_init(scc_pproc_macro_table_t *macros) {
Assert(macros != null);
macros->table.hash_func = hash_func;
macros->table.key_cmp = hash_cmp;
scc_hashtable_init(&macros->table);
}
static int macro_free(const void *key, void *value, void *context) {
(void)key;
(void)context;
scc_pproc_macro_drop(value);
return 0;
}
void scc_pproc_macro_table_drop(scc_pproc_macro_table_t *macros) {
Assert(macros != null);
scc_hashtable_foreach(&macros->table, macro_free, null);
scc_hashtable_drop(&macros->table);
}

136
libs/pproc/src/scc_pproc.c Normal file

@@ -0,0 +1,136 @@
#include <pproc_expand.h>
#include <scc_lexer_utils.h>
#include <scc_pproc.h>
static int pproc_next_one_file(scc_pproc_t *pp, scc_lexer_tok_t *out) {
CONTINUE:; // empty statement: before C23 a label cannot directly precede a declaration
scc_lexer_tok_ring_t *stream = pp->cur_ring;
scc_lexer_tok_t tok = {0};
int ok = 0;
if (pp->expanded_ring.cap) {
scc_ring_next_consume(pp->expanded_ring, *out, ok);
if (ok == false) {
scc_ring_free(pp->expanded_ring);
goto CONTINUE;
} else {
return true;
}
}
scc_ring_peek(*stream, tok, ok);
if (ok == false) {
return false;
}
if (tok.type == SCC_TOK_ENDLINE) {
scc_ring_next_consume(*stream, *out, ok);
pp->at_line_start = true;
return true;
}
if (pp->at_line_start) {
if (tok.type == SCC_TOK_SHARP) {
// parse to #
scc_pproc_handle_directive(pp);
pp->at_line_start = true;
goto CONTINUE;
} else if (tok.type == SCC_TOK_BLANK) {
scc_ring_next(*stream, *out, ok);
scc_ring_peek(*stream, tok, ok);
if (ok && tok.type == SCC_TOK_SHARP) {
scc_pproc_handle_directive(pp);
pp->at_line_start = true;
goto CONTINUE;
}
scc_ring_back(*stream, ok);
Assert(ok == true);
scc_ring_peek(*stream, tok, ok);
}
}
if (pp->enable == false) {
scc_lexer_skip_until_newline(stream);
pp->at_line_start = true;
goto CONTINUE;
}
if (tok.type == SCC_TOK_IDENT) {
// maybe expanded
scc_pproc_macro_t *macro =
scc_pproc_macro_table_get(&pp->macro_table, &tok.lexeme);
if (macro == null) {
scc_ring_next_consume(*stream, *out, ok);
return ok;
}
scc_pproc_expand_by_src(&pp->macro_table, pp->cur_ring,
&pp->expanded_ring, macro);
goto CONTINUE;
} else {
// continue
scc_ring_next_consume(*stream, *out, ok);
return ok;
}
return false;
}
static int pproc_next(scc_pproc_t *pp, scc_lexer_tok_t *tok) {
CONTINUE:; // empty statement: before C23 a label cannot directly precede a declaration
int ret = pproc_next_one_file(pp, tok);
if (ret != 0) {
return true;
}
if (scc_vec_size(pp->file_stack) == 0) {
return false;
}
scc_pproc_file_t *file = scc_vec_pop(pp->file_stack);
Assert(file->ring == pp->cur_ring);
scc_lexer_drop_ring(file->ring);
scc_lexer_drop(&(file->lexer));
scc_sstream_drop(&(file->sstream));
scc_free(file);
if (scc_vec_size(pp->file_stack) == 0) {
pp->cur_ring = pp->org_ring;
} else {
pp->cur_ring =
scc_vec_at(pp->file_stack, scc_vec_size(pp->file_stack) - 1)->ring;
}
goto CONTINUE;
}
void scc_pproc_init(scc_pproc_t *pp, scc_lexer_tok_ring_t *input) {
Assert(pp != null && input != null);
pp->org_ring = input;
pp->cur_ring = pp->org_ring;
scc_ring_init(pp->expanded_ring, 0, 0, 0);
scc_pproc_marco_table_init(&pp->macro_table);
scc_vec_init(pp->include_paths);
scc_vec_init(pp->if_stack);
scc_vec_init(pp->file_stack);
pp->at_line_start = true;
pp->enable = true;
pp->config.max_include_depth = 32;
}
void scc_pproc_add_builtin_macros() {
// TODO
}
static cbool fill_token(scc_lexer_tok_t *tok, void *userdata) {
scc_pproc_t *pp = userdata;
return pproc_next(pp, tok);
}
scc_lexer_tok_ring_t *scc_pproc_to_ring(scc_pproc_t *pp, int ring_size) {
scc_ring_init(pp->ring, ring_size, fill_token, pp);
pp->ring_ref_count++;
return &pp->ring;
}
// Destroy the preprocessor
void scc_pproc_drop(scc_pproc_t *pp) {
if (pp == null)
return;
scc_lexer_drop_ring(pp->cur_ring);
scc_pproc_macro_table_drop(&pp->macro_table);
}


@@ -0,0 +1,550 @@
#include <assert.h>
#include <scc_pproc.h>
#include <string.h>
#include <utest/acutest.h>
static cbool process_input(const char *input, scc_cstring_t *output) {
int ret = 0;
scc_sstream_t mem_stream;
ret = scc_sstream_init_by_buffer(&mem_stream, input, strlen(input), false,
16);
Assert(ret == 0);
scc_lexer_t lexer;
scc_lexer_init(&lexer, scc_sstream_to_ring(&mem_stream));
scc_pproc_t pp;
scc_pproc_init(&pp, scc_lexer_to_ring(&lexer, 8, true));
scc_lexer_tok_ring_t *tok_ring = scc_pproc_to_ring(&pp, 8);
*output = scc_cstring_from_cstr("");
scc_lexer_tok_t tok;
while (1) {
scc_ring_next_consume(*tok_ring, tok, ret);
if (!ret) {
break;
}
scc_cstring_append(output, &tok.lexeme);
scc_lexer_tok_drop(&tok);
}
scc_pproc_drop(&pp);
scc_lexer_drop(&lexer);
scc_sstream_drop(&mem_stream);
return true;
}
#define CHECK_PP_OUTPUT_EXACT(input, expect) \
do { \
scc_cstring_t output; \
process_input(input, &output); \
assert(output.data != NULL); \
TEST_CHECK(strcmp(output.data, expect) == 0); \
TEST_MSG("Expected: %s", expect); \
TEST_MSG("Produced: %s", output.data); \
} while (0)
#define CHECK_PP_OUTPUT_CONTAIN(input, expect) \
do { \
scc_cstring_t output; \
process_input(input, &output); \
assert(output.data != NULL); \
TEST_CHECK(strstr(output.data, expect) != NULL); \
TEST_MSG("Expected: %s", expect); \
TEST_MSG("Produced: %s", output.data); \
} while (0)
static void test_define_simple_no_macro(void) {
TEST_CASE("simple no macro");
CHECK_PP_OUTPUT_EXACT("a", "a");
CHECK_PP_OUTPUT_EXACT("a()", "a()");
CHECK_PP_OUTPUT_EXACT("a(b)", "a(b)");
CHECK_PP_OUTPUT_EXACT("a(b, c)", "a(b, c)");
CHECK_PP_OUTPUT_EXACT("a(b, c, d)", "a(b, c, d)");
}
static void test_define_simple_object_macro(void) {
TEST_CASE("simple object-like macro");
CHECK_PP_OUTPUT_EXACT("#define MAX 100\nMAX\n", "100\n");
CHECK_PP_OUTPUT_EXACT("#define NAME test\r\nNAME\n", "test\n");
}
static void test_define_complex_object_macro(void) {
TEST_CASE("complex object-like macro");
CHECK_PP_OUTPUT_EXACT("#define VALUE (100 + 50)\nVALUE\n", "(100 + 50)\n");
CHECK_PP_OUTPUT_EXACT("#define PI 3.14159\nPI\n", "3.14159\n");
}
static void test_define_object_macro_backspace(void) {
TEST_CASE("object-like macro with extra whitespace");
CHECK_PP_OUTPUT_EXACT("#define MAX 100\nMAX\n", "100\n");
CHECK_PP_OUTPUT_EXACT("#define NAME \ttest\r\nNAME\n", "test\n");
CHECK_PP_OUTPUT_EXACT("#define \tVALUE (100 \t+ 50)\nVALUE\n",
"(100 + 50)\n");
CHECK_PP_OUTPUT_EXACT("#define \tPI \t 3.14159\nPI\n", "3.14159\n");
}
static void test_define_function_macro(void) {
TEST_CASE("function-like macro");
CHECK_PP_OUTPUT_EXACT("#define ADD(a,b) a + b\nADD(1, 2)\n", "1 + 2\n");
CHECK_PP_OUTPUT_EXACT("#define ADD( a , b ) a + b\nADD(1, 2)\n", "1 + 2\n");
CHECK_PP_OUTPUT_EXACT(
"#define MAX(a,b) ((a) > (b) ? (a) : (b))\nMAX(10, 20)\n",
"((10) > (20) ? (10) : (20))\n");
}
static void test_define_stringify_operator(void) {
TEST_CASE("stringify operator (#)");
CHECK_PP_OUTPUT_EXACT("#define STRINGIFY(x) #x\nSTRINGIFY(hello)\n",
"\"hello\"\n");
CHECK_PP_OUTPUT_EXACT("#define STR(x) #x\nSTR(test value)\n",
"\"test value\"\n");
CHECK_PP_OUTPUT_EXACT("#define STR(x) #x\nSTR(A B \"ab\")\n",
"\"A B \"ab\"\"\n");
CHECK_PP_OUTPUT_EXACT("#define STR(x) # x\nSTR(A B \"ab\")\n",
"\"A B \"ab\"\"\n");
}
static void test_define_concat_operator(void) {
TEST_CASE("concatenation operator (##)");
CHECK_PP_OUTPUT_EXACT("#define CONCAT a##b\nCONCAT\n", "ab\n");
CHECK_PP_OUTPUT_EXACT("#define CONCAT a ## b\nCONCAT\n", "ab\n");
CHECK_PP_OUTPUT_EXACT("#define CONCAT(a,b) a##b\nCONCAT(hello,world)\n",
"helloworld\n");
CHECK_PP_OUTPUT_EXACT(
"#define CONCAT( a , b ) a ## b\nCONCAT( hello , world )\n",
"helloworld\n");
CHECK_PP_OUTPUT_EXACT("#define JOIN(pre,suf) pre ## suf\nJOIN(var, 123)\n",
"var123\n");
}
static void test_define_nested_macros(void) {
TEST_CASE("nested macros");
CHECK_PP_OUTPUT_EXACT(
"#define MAX 100\n#define TWICE_MAX (MAX * 2)\nTWICE_MAX\n",
"(100 * 2)\n");
CHECK_PP_OUTPUT_EXACT(
"#define A 1\n#define B (A + 1)\n#define C (B + 1)\nC\n",
"((1 + 1) + 1)\n");
CHECK_PP_OUTPUT_EXACT("#define A\n", "");
CHECK_PP_OUTPUT_EXACT("#undef A\n", "");
CHECK_PP_OUTPUT_EXACT(" # define A 1\nA", "1");
// CHECK_PP_OUTPUT_EXACT(" # define A 1 \nA", "1"); // TODO
CHECK_PP_OUTPUT_EXACT("#define CONCAT(str) __scc_##str\nCONCAT(int)",
"__scc_int");
CHECK_PP_OUTPUT_EXACT("#define CONCAT(str) str##_scc__\nCONCAT(int)",
"int_scc__");
CHECK_PP_OUTPUT_EXACT("#define CONCAT(str) __scc_ ## str\nCONCAT(int)",
"__scc_int");
// TEST_CASE("TODO"); /*FALSE*/
// CHECK_PP_OUTPUT_EXACT("#define str(x) # x\n"
// "str()\n",
// "\"\"\n");
TEST_CASE("TODO");
CHECK_PP_OUTPUT_EXACT("#define x 1\n"
"#define f(a) f(x * (a))\n"
"f(0)\n"
"f(x)",
"f(1 * (0))\n"
"f(1 * (1))");
CHECK_PP_OUTPUT_EXACT("#define x x(0)\n"
"#define f(a) f(x * (a))\n"
"f(f(0))\n"
"f(f(x))\n"
"f(f(a))\n",
"f(x(0) * (f(x(0) * (0))))\n"
"f(x(0) * (f(x(0) * (x(0)))))\n"
"f(x(0) * (f(x(0) * (a))))\n");
}
static void test_undef_macros(void) {
TEST_CASE("test_undef_macros");
CHECK_PP_OUTPUT_EXACT("#define x 1\n"
"x\n"
"#undef x\n"
"x\n"
"#define x 2\n"
"x\n",
"1\nx\n2\n");
}
static void hard_test_define_func_macros(void) {
TEST_CASE("func_macros_hard with pp_01");
CHECK_PP_OUTPUT_EXACT("#define hash_hash # ## #\n"
"#define mkstr(a) # a\n"
"#define in_between(a) mkstr(a)\n"
"#define join(c, d) in_between(c hash_hash d)\n"
"char p[] = join(x, y);\n",
"char p[] = \"x ## y\";\n");
TEST_CASE("func_macros_hard with recursive define");
CHECK_PP_OUTPUT_EXACT("#define M1(x) M2(x + 1)\n"
"#define M2(x) M1(x * 2)\n"
"M1(5)\n",
"M1(5 + 1 * 2)\n");
CHECK_PP_OUTPUT_EXACT("#define A B\n"
"#define B C\n"
"#define C 1\n"
"A\n",
"1\n");
TEST_CASE("func_macros_hard with self recursive call");
CHECK_PP_OUTPUT_EXACT("#define M(x) x\n"
"M(M(10))\n",
"10\n");
CHECK_PP_OUTPUT_EXACT("#define M(x) M(x)\n"
"#define N(x) x\n"
"N(M(1))\n",
"M(1)\n");
TEST_CASE("func_macros_hard with define by macro");
CHECK_PP_OUTPUT_EXACT("#define M1(x) M1(x + 1)\n"
"#define M2 M1\n"
"#define M3(x) x\n"
"M3(M3(M2)(0))\n",
"M1(0 + 1)\n");
TEST_CASE("nested parentheses in arguments");
CHECK_PP_OUTPUT_EXACT("#define MACRO(a, b, c) a, b, c\n"
"MACRO(1, (2,3), 4)\n",
"1, (2,3), 4\n");
TEST_CASE("max_macro hard");
CHECK_PP_OUTPUT_EXACT("#define max(a, b) ((a) > (b) ? (a) : (b))\n"
"max(1, 2)\n",
"((1) > (2) ? (1) : (2))\n");
CHECK_PP_OUTPUT_EXACT("#define max(a, b) ((a) > (b) ? (a) : (b))\n"
"max(max(x, y), z)\n",
"((((x) > (y) ? (x) : (y))) > (z) ? (((x) > (y) ? "
"(x) : (y))) : (z))\n");
}
static void test_error_cases(void) {
TEST_CASE("macro redefinition");
// should produce a warning or an error
// CHECK_PP_OUTPUT_CONTAIN("#define A 1\n#define A 2\n", "warning");
TEST_CASE("undefined macro");
CHECK_PP_OUTPUT_EXACT("UNDEFINED_MACRO\n", "UNDEFINED_MACRO\n");
}
static void test_edge_cases(void) {
TEST_CASE("empty macro");
CHECK_PP_OUTPUT_EXACT("#define EMPTY()\nEMPTY()\n", "\n");
TEST_CASE("macro with only spaces");
CHECK_PP_OUTPUT_EXACT("#define SPACE \nSPACE\n", "\n");
TEST_CASE("deep nesting");
CHECK_PP_OUTPUT_EXACT("#define A B\n#define B C\n#define C 1\nA\n", "1\n");
}
static void test_conditional_ifdef(void) {
TEST_CASE("ifdef and ifndef");
// basic ifdef / ifndef
CHECK_PP_OUTPUT_EXACT("#define FOO\n"
"#ifdef FOO\n"
"foo\n"
"#endif\n",
"foo\n");
CHECK_PP_OUTPUT_EXACT("#define FOO\n"
"#ifndef FOO\n"
"foo\n"
"#endif\n",
"");
CHECK_PP_OUTPUT_EXACT("#undef FOO\n"
"#ifdef FOO\n"
"foo\n"
"#endif\n",
"");
CHECK_PP_OUTPUT_EXACT("#undef FOO\n"
"#ifndef FOO\n"
"foo\n"
"#endif\n",
"foo\n");
// ifdef + else
CHECK_PP_OUTPUT_EXACT("#define FOO\n"
"#ifdef FOO\n"
"foo\n"
"#else\n"
"bar\n"
"#endif\n",
"foo\n");
CHECK_PP_OUTPUT_EXACT("#undef FOO\n"
"#ifdef FOO\n"
"foo\n"
"#else\n"
"bar\n"
"#endif\n",
"bar\n");
// ifdef + elifdef (C23)
CHECK_PP_OUTPUT_EXACT("#define FOO\n"
"#ifdef FOO\n"
"foo\n"
"#elifdef FOO\n"
"foo2\n"
"#endif\n",
"foo\n");
CHECK_PP_OUTPUT_EXACT("#undef FOO\n"
"#define BAR\n"
"#ifdef FOO\n"
"foo\n"
"#elifdef BAR\n"
"bar\n"
"#else\n"
"none\n"
"#endif\n",
"bar\n");
CHECK_PP_OUTPUT_EXACT("#undef FOO\n"
"#undef BAR\n"
"#ifdef FOO\n"
"foo\n"
"#elifdef BAR\n"
"bar\n"
"#else\n"
"none\n"
"#endif\n",
"none\n");
// nesting
CHECK_PP_OUTPUT_EXACT("#define A\n"
"#ifdef A\n"
" #ifdef B\n"
" inner\n"
" #endif\n"
" outer\n"
"#endif\n",
" outer\n");
CHECK_PP_OUTPUT_EXACT("#define B\n"
"#ifndef A\n"
" #ifdef B\n"
" inner\n"
" #endif\n"
" outer\n"
"#endif\n",
" inner\n outer\n");
// outer condition false, inner condition true
CHECK_PP_OUTPUT_EXACT("#ifdef __NONE\n"
"#define OUTER\n"
"#endif\n"
"#ifdef OUTER\n"
"should not appear\n"
"#endif\n",
""); // expected to be empty
// more complex nested conditions
CHECK_PP_OUTPUT_EXACT("#define X\n"
"#ifdef X\n"
"x defined\n"
"#ifdef Y\n"
"Y defined\n"
"#else\n"
"Y not defined\n"
"#endif\n"
"after inner\n"
"#else\n"
"X not defined\n"
"#endif\n",
"x defined\nY not defined\nafter inner\n");
}
static void test_simple_number_conditional_if(void) {
TEST_CASE("if and elif with a single integer constant");
// basic #if
CHECK_PP_OUTPUT_EXACT("#if 1\ntrue\n#endif\n", "true\n");
CHECK_PP_OUTPUT_EXACT("#if 0\nfalse\n#endif\n", "");
CHECK_PP_OUTPUT_EXACT("#if 4\ntrue\n#endif\n", "true\n");
CHECK_PP_OUTPUT_EXACT("#if 0\nfalse\n#else\nother\n#endif\n", "other\n");
// if + elif + else
CHECK_PP_OUTPUT_EXACT("#if 0\nzero\n#elif 1\none\n#else\nother\n#endif\n",
"one\n");
CHECK_PP_OUTPUT_EXACT("#if 0\nzero\n#elif 0\none\n#else\nother\n#endif\n",
"other\n");
CHECK_PP_OUTPUT_EXACT("#if 1\nfirst\n#elif 1\nsecond\n#endif\n", "first\n");
// nesting
CHECK_PP_OUTPUT_EXACT(
"#if 1\n #if 0\n inner\n #endif\n outer\n#endif\n", " outer\n");
CHECK_PP_OUTPUT_EXACT("#if 0\n #if 1\n inner\n #endif\n "
"outer\n#else\n alternative\n#endif\n",
" alternative\n");
// mixed with #ifdef
CHECK_PP_OUTPUT_EXACT("#define FOO\n"
"#if 1\n"
" #ifdef FOO\n"
" foo\n"
" #endif\n"
" bar\n"
"#endif\n",
" foo\n bar\n");
}
static void test_variadic_macros(void) {
TEST_CASE("variadic macros with __VA_ARGS__");
// basic variadic macro
CHECK_PP_OUTPUT_EXACT("#define FOO(x, ...) x __VA_ARGS__\n"
"FOO(1, 2, 3)\n",
"1 2, 3\n");
// multiple arguments
CHECK_PP_OUTPUT_EXACT("#define SUM(...) (__VA_ARGS__)\n"
"SUM(1, 2, 3)\n",
"(1, 2, 3)\n");
// combined with printf
CHECK_PP_OUTPUT_EXACT("#define DEBUG(fmt, ...) printf(fmt, __VA_ARGS__)\n"
"DEBUG(\"hello\", 1, 2)\n",
"printf(\"hello\", 1, 2)\n");
// empty variadic arguments
CHECK_PP_OUTPUT_EXACT("#define FOO(x, ...) x __VA_ARGS__\n"
"FOO(1)\n",
"1 \n");
CHECK_PP_OUTPUT_EXACT("#define FOO(x, ...) #__VA_ARGS__\nFOO(1);", "\"\";");
}
static void test_gnu_comma_variadic_deletion(void) {
TEST_CASE("GNU comma deletion with ## and __VA_ARGS__");
// variadic arguments are empty, so the comma is deleted
CHECK_PP_OUTPUT_EXACT("#define FOO(fmt, ...) printf(fmt, ## __VA_ARGS__)\n"
"FOO(\"hello\")\n",
"printf(\"hello\")\n");
// variadic arguments are not empty, so the comma is kept
CHECK_PP_OUTPUT_EXACT("#define FOO(fmt, ...) printf(fmt, ## __VA_ARGS__)\n"
"FOO(\"%d\", 42)\n",
"printf(\"%d\",42)\n");
// variant with different whitespace
CHECK_PP_OUTPUT_EXACT("#define FOO(fmt,...) printf(fmt,##__VA_ARGS__)\n"
"FOO(\"%d\", 42)\n",
"printf(\"%d\",42)\n");
}
static void test_c99_docs(void) {
TEST_CASE("6.10.3.3 The ## operator EXAMPLE");
CHECK_PP_OUTPUT_EXACT("#define hash_hash # ## #\n"
"#define mkstr(a) # a\n"
"#define in_between(a) mkstr(a)\n"
"#define join(c, d) in_between(c hash_hash d)\n"
"char p[] = join(x, y);\n",
"char p[] = \"x ## y\";\n");
// 6.10.3.5 Scope of macro definitions
TEST_CASE("EXAMPLE 3 To illustrate the rules for redefinition and "
"reexamination, the sequence");
/*
CHECK_PP_OUTPUT_EXACT(
"#define x 3\n"
"#define f(a) f(x * (a))\n"
"#undef x\n"
"#define x 2\n"
"#define g f\n"
"#define z z[0]\n"
"#define h g(~\n"
"#define m(a) a(w)\n"
"#define w 0,1\n"
"#define t(a) a\n"
"#define p() int\n"
"#define q(x) x\n"
"#define r(x,y) x ## y\n"
"#define str(x) # x\n"
"f(y+1) + f(f(z)) % t(t(g)(0) + t)(1);\n"
"g(x+(3,4)-w) | h 5) & m\n"
" (f)^m(m);\n"
"p() i[q()] = { q(1), r(2,3), r(4,), r(,5), r(,) };\n"
"char c[2][6] = { str(hello), str() };\n",
"f(2 * (y+1)) + f(2 * (f(2 * (z[0])))) % f(2 * (0)) + t(1);\n"
"f(2 * (2+(3,4)-0,1)) | f(2 * (~ 5)) & f(2 * (0,1))^m(0,1);\n"
"int i[] = { 1, 23, 4, 5, };\n"
"char c[2][6] = { \"hello\", \"\" };\n");
*/
TEST_CASE("EXAMPLE 4 To illustrate the rules for creating character string "
"literals and concatenating tokens, the sequence");
TEST_CASE("EXAMPLE 5 To illustrate the rules for placemarker preprocessing "
"tokens, the sequence");
/*
CHECK_PP_OUTPUT_EXACT("#define t(x,y,z) x ## y ## z\n"
"int j[] = { t(1,2,3), t(,4,5), t(6,,7), t(8,9,),\n"
"\t\t\tt(10,,), t(,11,), t(,,12), t(,,) };\n",
"int j[] = { 123, 45, 67, 89,\n"
"\t\t\t10, 11, 12, };\n");
*/
TEST_CASE("EXAMPLE 6 To demonstrate the redefinition rules, the following "
"sequence is valid.");
CHECK_PP_OUTPUT_EXACT(
"#define OBJ_LIKE (1-1)\n"
"#define OBJ_LIKE /* white space */ (1-1) /* other */\n"
"#define FUNC_LIKE(a) (a)\n"
"#define FUNC_LIKE( a )( /* note the white space */ \\\n"
" a /* other stuff on this line\n"
" */ )\n",
"");
// INVALID
// CHECK_PP_OUTPUT_EXACT(
// "#define OBJ_LIKE (0) // different token sequence\n"
// "#define OBJ_LIKE (1 - 1) // different white space\n"
// "#define FUNC_LIKE(b) ( a ) // different parameter usage\n"
// "#define FUNC_LIKE(b) ( b ) // different parameter spelling\n");
TEST_CASE("EXAMPLE 7 Finally, to show the variable argument list macro "
"facilities:");
CHECK_PP_OUTPUT_EXACT(
"#define debug(...) fprintf(stderr, __VA_ARGS__)\n"
"#define showlist(...) puts(#__VA_ARGS__)\n"
"#define report(test, ...) ((test)?puts(#test): \\\n"
" printf(__VA_ARGS__))\n"
"debug(\"Flag\");\n"
"debug(\"X = %d\\n\", x);\n"
"showlist(The first, second, and third items.);\n"
"report(x>y, \"x is %d but y is %d\", x, y);\n",
"fprintf(stderr, \"Flag\");\n"
"fprintf(stderr, \"X = %d\\n\", x);\n"
"puts(\"The first, second, and third items.\");\n"
"((x>y)?puts(\"x>y\"): printf(\"x is %d but y is %d\", x, y));\n");
}
#define TEST_LIST_CASE(func_name) {#func_name, func_name}
TEST_LIST = {
TEST_LIST_CASE(test_define_simple_no_macro),
TEST_LIST_CASE(test_define_simple_object_macro),
TEST_LIST_CASE(test_define_complex_object_macro),
TEST_LIST_CASE(test_define_object_macro_backspace),
TEST_LIST_CASE(test_define_function_macro),
TEST_LIST_CASE(test_define_stringify_operator),
TEST_LIST_CASE(test_define_concat_operator),
TEST_LIST_CASE(test_define_nested_macros),
TEST_LIST_CASE(test_undef_macros),
TEST_LIST_CASE(hard_test_define_func_macros),
TEST_LIST_CASE(test_conditional_ifdef),
TEST_LIST_CASE(test_simple_number_conditional_if),
TEST_LIST_CASE(test_variadic_macros),
TEST_LIST_CASE(test_gnu_comma_variadic_deletion),
TEST_LIST_CASE(test_c99_docs),
{NULL, NULL},
};
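
As a cross-check for the expectations above, two of the C99 examples can be evaluated by any conforming host compiler. This is a small sketch: the first four macros are copied from the 6.10.3.3 example, while `stringify_all`, `joined` and `listed` are hypothetical names introduced here for illustration.

```c
#include <string.h>

/* Macros from the C99 6.10.3.3 example. */
#define hash_hash # ## #
#define mkstr(a) # a
#define in_between(a) mkstr(a)
#define join(c, d) in_between(c hash_hash d)

/* Hypothetical helper: stringifies the whole variable argument list,
 * like showlist in EXAMPLE 7 but without the call to puts. */
#define stringify_all(...) #__VA_ARGS__

/* hash_hash expands to a ## token that is not treated as an operator,
 * so the argument of mkstr is the three tokens x, ##, y. */
static const char joined[] = join(x, y);

/* The commas are part of __VA_ARGS__ and end up inside the string. */
static const char listed[] = stringify_all(The first, second, and third items.);
```

Comparing `joined` against `"x ## y"` and `listed` against `"The first, second, and third items."` on a reference compiler gives an independent check of the strings used in the test expectations.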


@@ -66,7 +66,7 @@ static int sstream_scan_at(scc_sstream_t *stream, scc_pos_t scan_pos,
 }
 // ring buffer fill callback (the stream object is obtained through userdata)
-static int fill_func(scc_sstream_char_t *out, void *userdata) {
+static cbool fill_func(scc_sstream_char_t *out, void *userdata) {
 scc_sstream_t *stream = (scc_sstream_t *)userdata;
 if (stream->fill_pos.offset >= stream->len)
 return false; // reached end of file
@@ -84,13 +84,13 @@ int scc_sstream_init(scc_sstream_t *stream, const char *fname, int ring_size) {
 scc_file_t file = scc_fopen(fname, SCC_FILE_READ);
 if (file == null) {
 LOG_ERROR("Failed to open file: %s", fname);
-return 0;
+return 1;
 }
 usize fsize = scc_fsize(file);
 if (fsize == 0) {
 LOG_WARN("file size is 0");
 scc_fclose(file);
-return 0;
+return 2;
 }
 char *buffer = (char *)scc_malloc(fsize);
 scc_memset(buffer, 0, fsize);


@@ -7,7 +7,12 @@
 #define __SCC_LOG_IMPL_H__
 #include "color.h"
+#ifndef __SCC__
 #include <stdarg.h>
+#else
+// TODO
+#warning "TODO: implement stdarg.h"
+#endif
 #ifdef __SCC_LOG_IMPL_USE_STD_IMPL__
 #include <stdio.h>
@@ -21,7 +26,7 @@
 #define __smcc_log_unreachable() (__builtin_unreachable())
 #elif defined _MSC_VER // MSVC
 #define __smcc_log_unreachable() (__assume(false))
-#elif defined __SMCC_BUILT_IN__ // The SMCC (my compiler)
+#elif defined __SCC_BUILT_IN__ // The SCC Compiler (my compiler)
 #define __smcc_log_unreachable() (__smcc_builtin_unreachable())
 #else
 #define __smcc_log_unreachable()
@@ -182,13 +187,10 @@ void log_set_handler(logger_t *logger, log_handler handler);
 * relies on the fact that an array size cannot be negative
 * or uses _Static_assert (C11)
 */
-#ifdef static_assert
-#undef static_assert
-#endif
 #if __STDC_VERSION__ >= 201112L
-#define static_assert _Static_assert
+#define StaticAssert static_assert
 #else
-#define static_assert(cond, msg) extern char __static_assertion[(cond) ? 1 : -1]
+#define StaticAssert(cond, msg) extern char __static_assertion[(cond) ? 1 : -1]
 #endif
 #ifdef __SCC_LOG_IMPL_IMPORT_SRC__


@@ -14,7 +14,7 @@
 * - head: logical index of the consumed elements
 * - probe: peek index
 * - tail: logical end index of the filled elements
-* - fill: fill callback (called when a new element is needed)
+* - fill: fill callback (called when a new element is needed); returns true on success
 */
 #define SCC_RING(type) \
 struct { \
@@ -23,20 +23,26 @@
 usize head; \
 usize probe; \
 usize tail; \
-int (*fill)(type * out, void *userdata); \
+cbool (*fill)(type * out, void *userdata); \
 void *userdata; \
 }
 // ==================== internal helper macros (not for direct use) ====================
-#define scc_ring_phys(ring, idx) ((idx) % (ring).cap)
+#define _scc_ring_phys(ring, idx) ((idx) % (ring).cap)
+#define _scc_ring_cap(ring) ((ring).cap)
+#define _scc_ring_head(ring) ((ring).head)
+#define _scc_ring_probe(ring) ((ring).probe)
+#define _scc_ring_tail(ring) ((ring).tail)
+#define _scc_ring_size(ring) ((ring).tail - (ring).head)
+#define _scc_ring_empty(ring) ((ring).head == (ring).tail)
 /**
 * @brief ensure that data is available at the probe position (tries to fill)
 * @param ring ring buffer variable
 * @param ok variable name (e.g. int ok_flag); the macro sets it to true or false
 */
-#define scc_ring_ensure(ring, ok) \
+#define _scc_ring_ensure(ring, ok) \
 do { \
 ok = 1; \
 if ((ring).probe < (ring).tail) \
@@ -50,8 +56,9 @@
 ok = 0; /* buffer is full, cannot fill */ \
 break; \
 } \
-usize phys_tail = scc_ring_phys(ring, (ring).tail); \
-if (!(ring).fill(&(ring).data[phys_tail], (ring).userdata)) { \
+usize phys_tail = _scc_ring_phys(ring, (ring).tail); \
+if ((ring).fill == null || \
+!(ring).fill(&(ring).data[phys_tail], (ring).userdata)) { \
 ok = 0; \
 break; \
 } \
@@ -79,6 +86,17 @@
(ring).userdata = (_userdata); \ (ring).userdata = (_userdata); \
} while (0) } while (0)
#define scc_ring_by_buffer(ring, buffer, size) \
do { \
(ring).data = (buffer); \
(ring).cap = (size); \
(ring).head = 0; \
(ring).probe = 0; \
(ring).tail = (size); \
(ring).fill = null; \
(ring).userdata = null; \
} while (0)
/** /**
* @brief Free the ring buffer's memory * @brief Free the ring buffer's memory
* @param ring ring buffer variable * @param ring ring buffer variable
@@ -98,14 +116,14 @@
*/ */
#define scc_ring_peek(ring, val, ok) \ #define scc_ring_peek(ring, val, ok) \
do { \ do { \
scc_ring_ensure(ring, ok); \ _scc_ring_ensure(ring, ok); \
if (!(ok)) \ if (!(ok)) \
break; \ break; \
if ((ring).probe >= (ring).tail) { \ if ((ring).probe >= (ring).tail) { \
ok = 0; \ ok = 0; \
break; \ break; \
} \ } \
usize _phys = scc_ring_phys(ring, (ring).probe); \ usize _phys = _scc_ring_phys(ring, (ring).probe); \
val = (ring).data[_phys]; \ val = (ring).data[_phys]; \
} while (0) } while (0)
@@ -117,14 +135,14 @@
*/ */
#define scc_ring_next(ring, val, ok) \ #define scc_ring_next(ring, val, ok) \
do { \ do { \
scc_ring_ensure(ring, ok); \ _scc_ring_ensure(ring, ok); \
if (!(ok)) \ if (!(ok)) \
break; \ break; \
if ((ring).probe >= (ring).tail) { \ if ((ring).probe >= (ring).tail) { \
ok = 0; \ ok = 0; \
break; \ break; \
} \ } \
usize _phys = scc_ring_phys(ring, (ring).probe); \ usize _phys = _scc_ring_phys(ring, (ring).probe); \
val = (ring).data[_phys]; \ val = (ring).data[_phys]; \
(ring).probe++; \ (ring).probe++; \
} while (0) } while (0)
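
The head/probe/tail bookkeeping documented in this header can be illustrated with a small function-based sketch. This is not the SCC_RING macro family itself: the `demo_` names are hypothetical, the element type is fixed to `char`, and the consume/reset semantics are assumed from the test names in this change set (`test_char_ring_consume_reset`), not copied from the header. It mirrors the new behaviour in this diff where `fill` returns a boolean and a null `fill` is treated as "no producer".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified char ring; a sketch of the idea, not SCC_RING. */
typedef struct {
    char *data;
    size_t cap;
    size_t head;   /* logical index of the consumed position */
    size_t probe;  /* logical index of the next element to read */
    size_t tail;   /* logical end index of the filled data */
    bool (*fill)(char *out, void *userdata); /* returns true on success */
    void *userdata;
} demo_ring_t;

/* Make sure an element exists at `probe`, pulling one from `fill` if needed. */
static bool demo_ring_ensure(demo_ring_t *r) {
    if (r->probe < r->tail)
        return true;                 /* already buffered */
    if (r->tail - r->head >= r->cap)
        return false;                /* buffer full, cannot fill */
    if (r->fill == NULL)
        return false;                /* pre-filled ring: no producer */
    if (!r->fill(&r->data[r->tail % r->cap], r->userdata))
        return false;                /* producer reported failure */
    r->tail++;
    return true;
}

/* Read the element at `probe` and advance `probe` (lookahead only). */
static bool demo_ring_next(demo_ring_t *r, char *out) {
    if (!demo_ring_ensure(r))
        return false;
    *out = r->data[r->probe % r->cap];
    r->probe++;
    return true;
}

/* Assumed semantics: commit everything read so far, freeing those slots. */
static void demo_ring_consume(demo_ring_t *r) { r->head = r->probe; }

/* Assumed semantics: rewind the lookahead to the last committed position. */
static void demo_ring_reset(demo_ring_t *r) { r->probe = r->head; }

/* Example producer: yields 'a'..'z', then fails (end of input). */
static bool demo_alpha_fill(char *out, void *userdata) {
    int *next = userdata;
    if (*next >= 26)
        return false;
    *out = (char)('a' + (*next)++);
    return true;
}
```

With a capacity of 3 and `demo_alpha_fill` as producer, three reads succeed, a fourth fails because nothing has been consumed yet, and after `demo_ring_consume` the physical slot 0 is reused for the next element. A ring set up the way `scc_ring_by_buffer` does it (tail equal to the buffer size, `fill` set to null) would serve only the pre-filled elements and then report failure, which is why the null check in `_scc_ring_ensure` matters.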


@@ -194,6 +194,11 @@ static inline char *scc_cstring_as_cstr(const scc_cstring_t *str) {
return str->data; return str->data;
} }
static inline int scc_cstring_cmp(const scc_cstring_t *str1,
const scc_cstring_t *str2) {
return scc_strcmp(scc_cstring_as_cstr(str1), scc_cstring_as_cstr(str2));
}
static inline char *scc_cstring_move_cstr(scc_cstring_t *str) { static inline char *scc_cstring_move_cstr(scc_cstring_t *str) {
if (str == null || str->data == null) { if (str == null || str->data == null) {
return null; return null;


@@ -1,8 +1,9 @@
#ifndef __SCC_CORE_TYPE_H__ #ifndef __SCC_CORE_TYPE_H__
#define __SCC_CORE_TYPE_H__ #define __SCC_CORE_TYPE_H__
#ifndef __SCC_BUILTIN_TYPE__ #ifndef __SCC__
#include <assert.h> #include <assert.h>
#include <stdarg.h>
#include <stdbool.h> #include <stdbool.h>
#include <stddef.h> #include <stddef.h>
#include <stdint.h> #include <stdint.h>
@@ -32,22 +33,28 @@ typedef bool cbool;
static_assert(sizeof(cbool) == 1, "cbool size must 1"); static_assert(sizeof(cbool) == 1, "cbool size must 1");
#else #else
#define __scc_i8 /* clang-format off */
#define __scc_i16 typedef __scc_i8 i8;
#define __scc_i32 typedef __scc_i16 i16;
#define __scc_i64 typedef __scc_i32 i32;
#define __scc_u8 typedef __scc_i64 i64;
#define __scc_u16 typedef __scc_u8 u8;
#define __scc_u32 typedef __scc_u16 u16;
#define __scc_u64 typedef __scc_u32 u32;
#define __scc_f32 typedef __scc_u64 u64;
#define __scc_f64
#define __scc_bool typedef __scc_isize isize;
#define __scc_char typedef __scc_usize usize;
#define __scc_void typedef __scc_isize pdiff;
#define __scc_null
#define __scc_isize typedef __scc_f32 f32;
#define __scc_usize typedef __scc_f64 f64;
typedef __scc_bool cbool;
/// void / null
#define null __scc_null
/* clang-format on */
#endif #endif
typedef union scc_cvalue { typedef union scc_cvalue {


@@ -60,7 +60,7 @@ void test_char_ring_basic(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 4, char_fill, 0); scc_ring_init(ring, 4, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
scc_ring_next(ring, c, ok); scc_ring_next(ring, c, ok);
@@ -109,7 +109,7 @@ void test_char_ring_full(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 3, char_fill, 0); scc_ring_init(ring, 3, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
scc_ring_next(ring, c, ok); scc_ring_next(ring, c, ok);
@@ -140,7 +140,7 @@ void test_char_ring_eof(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 32, char_fill, 0); scc_ring_init(ring, 32, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
for (int i = 0; i < 26; i++) { for (int i = 0; i < 26; i++) {
@@ -160,7 +160,7 @@ void test_char_ring_back_boundary(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 4, char_fill, 0); scc_ring_init(ring, 4, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
scc_ring_next(ring, c, ok); scc_ring_next(ring, c, ok);
@@ -186,7 +186,7 @@ void test_char_ring_consume_reset(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 5, char_fill, 0); scc_ring_init(ring, 5, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
scc_ring_next(ring, c, ok); scc_ring_next(ring, c, ok);
@@ -219,7 +219,7 @@ void test_char_ring_wrap(void) {
reset_char_fill(); reset_char_fill();
char_ring_t ring; char_ring_t ring;
scc_ring_init(ring, 3, char_fill, 0); scc_ring_init(ring, 3, char_fill, 0);
char c; char c = 0;
cbool ok; cbool ok;
for (int i = 0; i < 26; i++) { for (int i = 0; i < 26; i++) {
@@ -239,7 +239,7 @@ void test_token_ring_basic(void) {
reset_token_fill(); reset_token_fill();
token_ring_t ring; token_ring_t ring;
scc_ring_init(ring, 3, token_fill, 0); scc_ring_init(ring, 3, token_fill, 0);
test_token_t tok; test_token_t tok = {0};
cbool ok; cbool ok;
scc_ring_next(ring, tok, ok); scc_ring_next(ring, tok, ok);
@@ -284,7 +284,7 @@ void test_token_ring_full(void) {
reset_token_fill(); reset_token_fill();
token_ring_t ring; token_ring_t ring;
scc_ring_init(ring, 2, token_fill, 0); scc_ring_init(ring, 2, token_fill, 0);
test_token_t tok; test_token_t tok = {0};
cbool ok; cbool ok;
scc_ring_next(ring, tok, ok); scc_ring_next(ring, tok, ok);


@@ -1,41 +1,21 @@
#include <argparse.h> #include <argparse.h>
#include <lexer.h> #include <scc_lexer.h>
#include <parser.h> #include <scc_pproc.h>
#include <pprocessor.h>
#include <ast_dump.h> // #include <scc_parser.h>
#include <ir_dump.h> // #include <ast_dump.h>
#include <scc_ast2ir.h> // #include <ir_dump.h>
// #include <scc_ast2ir.h>
#include <stdio.h>
static scc_probe_stream_t *from_file_stream(FILE *fp) {
if (fseek(fp, 0, SEEK_END) != 0) {
perror("fseek failed");
return NULL;
}
usize fsize = ftell(fp);
if (fseek(fp, 0, SEEK_SET)) {
perror("fseek failed");
return NULL;
}
char *buffer = (char *)scc_malloc(fsize);
scc_memset(buffer, 0, fsize);
usize read_ret = fread(buffer, 1, fsize, fp);
fclose(fp);
scc_probe_stream_t *stream =
scc_mem_probe_stream_alloc(buffer, read_ret, true);
return stream;
}
typedef struct { typedef struct {
const char *input_file; const char *input_file;
const char *output_file; const char *output_file;
int verbose; int verbose;
cbool dump_ast; scc_argparse_list_t include_paths;
cbool dump_ir; cbool emit_lex;
cbool emit_pp;
cbool emit_ast;
cbool emit_ir;
} scc_config_t; } scc_config_t;
static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config, static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config,
@@ -45,17 +25,23 @@ static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config,
SCC_HINT_DESCRIPTION, SCC_HINT_DESCRIPTION,
SCC_HINT_OUTPUT_FILE, SCC_HINT_OUTPUT_FILE,
SCC_HINT_INPUT_FILE, SCC_HINT_INPUT_FILE,
SCC_HINT_INCLUDE_PATH,
SCC_HINT_VERBOSE, SCC_HINT_VERBOSE,
SCC_HINT_EMIT_LEX,
SCC_HINT_EMIT_PP,
SCC_HINT_EMIT_AST, SCC_HINT_EMIT_AST,
SCC_HINT_EMIT_IR, SCC_HINT_EMIT_IR,
}; };
static const char *scc_hints_en[] = { static const char *scc_hints_en[] = {
[SCC_HINT_PROG_NAME] = "scc", [SCC_HINT_PROG_NAME] = "scc",
[SCC_HINT_DESCRIPTION] = "A simple C compiler", [SCC_HINT_DESCRIPTION] = "A simple C compiler",
[SCC_HINT_OUTPUT_FILE] = "Output file", [SCC_HINT_OUTPUT_FILE] = "Output file",
[SCC_HINT_INPUT_FILE] = "Input source file", [SCC_HINT_INPUT_FILE] = "Input source file",
[SCC_HINT_INCLUDE_PATH] = "SCC_HINT_INCLUDE_PATH",
[SCC_HINT_VERBOSE] = "Increase verbosity (can be used multiple times)", [SCC_HINT_VERBOSE] = "Increase verbosity (can be used multiple times)",
[SCC_HINT_EMIT_LEX] = "Generate lexer sources tokens and exit",
[SCC_HINT_EMIT_PP] = "Generate preprocessed tokens and exit",
[SCC_HINT_EMIT_AST] = "Generate AST and exit", [SCC_HINT_EMIT_AST] = "Generate AST and exit",
[SCC_HINT_EMIT_IR] = "Generate IR and exit", [SCC_HINT_EMIT_IR] = "Generate IR and exit",
}; };
@@ -64,9 +50,12 @@ static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config,
[SCC_HINT_DESCRIPTION] = "一个简单的C编译器", [SCC_HINT_DESCRIPTION] = "一个简单的C编译器",
[SCC_HINT_OUTPUT_FILE] = "输出文件", [SCC_HINT_OUTPUT_FILE] = "输出文件",
[SCC_HINT_INPUT_FILE] = "输入源文件", [SCC_HINT_INPUT_FILE] = "输入源文件",
[SCC_HINT_INCLUDE_PATH] = "SCC_HINT_INCLUDE_PATH",
[SCC_HINT_VERBOSE] = "增加详细输出(可多次使用)", [SCC_HINT_VERBOSE] = "增加详细输出(可多次使用)",
[SCC_HINT_EMIT_AST] = "生成 AST 并退出", [SCC_HINT_EMIT_LEX] = "生成`源代码的词法单元`并退出",
[SCC_HINT_EMIT_IR] = "生成 IR 并退出", [SCC_HINT_EMIT_PP] = "生成`预处理后的词法单元`并退出",
[SCC_HINT_EMIT_AST] = "生成`抽象语法树`并退出",
[SCC_HINT_EMIT_IR] = "生成`中间代码`并退出",
}; };
const char **scc_hints; const char **scc_hints;
@@ -101,24 +90,44 @@ static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config,
scc_argparse_spec_set_required(&arg_input.spec, true); scc_argparse_spec_set_required(&arg_input.spec, true);
scc_argparse_cmd_add_arg(root, &arg_input); scc_argparse_cmd_add_arg(root, &arg_input);
// -I, --include (add an extra system header search path)
scc_argparse_opt_t opt_include;
scc_argparse_opt_init(&opt_include, 'I', "include",
scc_hints[SCC_HINT_INCLUDE_PATH]);
scc_argparse_spec_setup_list(&opt_include.spec, &(config->include_paths));
scc_argparse_cmd_add_opt(root, &opt_include);
// -v, --verbose (count) // -v, --verbose (count)
scc_argparse_opt_t opt_verbose; scc_argparse_opt_t opt_verbose;
scc_argparse_opt_init(&opt_verbose, 'v', "verbose", scc_argparse_opt_init(&opt_verbose, 'V', "verbose",
scc_hints[SCC_HINT_VERBOSE]); scc_hints[SCC_HINT_VERBOSE]);
scc_argparse_spec_setup_count(&opt_verbose.spec, &(config->verbose)); scc_argparse_spec_setup_count(&opt_verbose.spec, &(config->verbose));
scc_argparse_cmd_add_opt(root, &opt_verbose); scc_argparse_cmd_add_opt(root, &opt_verbose);
// -T, --ast // --emit-lex
scc_argparse_opt_t opt_lex;
scc_argparse_opt_init(&opt_lex, 0, "emit-lex",
scc_hints[SCC_HINT_EMIT_LEX]);
scc_argparse_spec_setup_bool(&opt_lex.spec, &(config->emit_lex));
scc_argparse_cmd_add_opt(root, &opt_lex);
// --emit-pp
scc_argparse_opt_t opt_pp;
scc_argparse_opt_init(&opt_pp, 0, "emit-pp", scc_hints[SCC_HINT_EMIT_PP]);
scc_argparse_spec_setup_bool(&opt_pp.spec, &(config->emit_pp));
scc_argparse_cmd_add_opt(root, &opt_pp);
// -T, --emit-ast
scc_argparse_opt_t opt_ast; scc_argparse_opt_t opt_ast;
scc_argparse_opt_init(&opt_ast, 'T', "emit-ast", scc_argparse_opt_init(&opt_ast, 'T', "emit-ast",
scc_hints[SCC_HINT_EMIT_AST]); scc_hints[SCC_HINT_EMIT_AST]);
scc_argparse_spec_setup_bool(&opt_ast.spec, &(config->dump_ast)); scc_argparse_spec_setup_bool(&opt_ast.spec, &(config->emit_ast));
scc_argparse_cmd_add_opt(root, &opt_ast); scc_argparse_cmd_add_opt(root, &opt_ast);
// -R, --ir // -R, --emit-ir
scc_argparse_opt_t opt_ir; scc_argparse_opt_t opt_ir;
scc_argparse_opt_init(&opt_ir, 'R', "emit-ir", scc_hints[SCC_HINT_EMIT_IR]); scc_argparse_opt_init(&opt_ir, 'R', "emit-ir", scc_hints[SCC_HINT_EMIT_IR]);
scc_argparse_spec_setup_bool(&opt_ir.spec, &(config->dump_ir)); scc_argparse_spec_setup_bool(&opt_ir.spec, &(config->emit_ir));
scc_argparse_cmd_add_opt(root, &opt_ir); scc_argparse_cmd_add_opt(root, &opt_ir);
} }
@@ -127,18 +136,70 @@ static void setup_argparse(scc_argparse_t *argparse, scc_config_t *config,
#include <windows.h> #include <windows.h>
#endif #endif
static void print_ring(scc_lexer_tok_ring_t *ring, int verbose) {
scc_lexer_tok_t tok = {0};
int ret = 0;
while (1) {
scc_ring_next_consume(*ring, tok, ret);
if (ret == false || tok.type == SCC_TOK_EOF) {
break;
}
if (verbose == 0) {
scc_printf("%s ", scc_get_tok_name(tok.type));
} else if (verbose >= 1) {
scc_printf("token [%-8s] `%s` at %s:%d:%d\n",
scc_get_tok_name(tok.type),
scc_cstring_as_cstr(&tok.lexeme), tok.loc.name,
tok.loc.line, tok.loc.col);
}
scc_lexer_tok_drop(&tok);
}
}
static void print_file(scc_lexer_tok_ring_t *ring, const char *file_name) {
scc_lexer_tok_t tok = {0};
int ret = 0;
scc_file_t fp = scc_fopen(file_name, SCC_FILE_WRITE);
if (fp == null) {
LOG_FATAL("Failed to open file %s", file_name);
return;
}
while (1) {
scc_ring_next_consume(*ring, tok, ret);
if (ret == false || tok.type == SCC_TOK_EOF) {
break;
}
usize ret = scc_fwrite(fp, scc_cstring_as_cstr(&tok.lexeme),
scc_cstring_len(&tok.lexeme));
if (ret != scc_cstring_len(&tok.lexeme)) {
LOG_FATAL("Failed to write to file %s", file_name);
}
scc_lexer_tok_drop(&tok);
}
scc_fclose(fp);
}
int main(int argc, const char **argv, const char **envp) { int main(int argc, const char **argv, const char **envp) {
#ifdef _WIN32 #ifdef _WIN32
SetConsoleOutputCP(CP_UTF8); SetConsoleOutputCP(CP_UTF8);
SetConsoleCP(CP_UTF8); SetConsoleCP(CP_UTF8);
#endif #endif
#ifdef _WIN32
#define OUTPUT_DEFAULT_FILE "a.exe"
#else
#define OUTPUT_DEFAULT_FILE "a.out"
#endif
scc_config_t config = { scc_config_t config = {
.input_file = NULL, .input_file = null,
.output_file = "a.exe",
.verbose = 0, .verbose = 0,
.dump_ast = false, .output_file = null,
.dump_ir = false, .emit_ast = false,
.emit_ir = false,
}; };
scc_vec_init(config.include_paths);
scc_argparse_t argparse; scc_argparse_t argparse;
setup_argparse(&argparse, &config, SCC_ARGPARSE_LANG_ZH); setup_argparse(&argparse, &config, SCC_ARGPARSE_LANG_ZH);
int ret = scc_argparse_parse(&argparse, argc, argv); int ret = scc_argparse_parse(&argparse, argc, argv);
@@ -148,51 +209,76 @@ int main(int argc, const char **argv, const char **envp) {
} }
scc_argparse_drop(&argparse); scc_argparse_drop(&argparse);
setbuf(stdout, NULL); scc_sstream_t sstream;
FILE *fp = fopen(config.input_file, "r"); if (scc_sstream_init(&sstream, config.input_file, 1024)) {
if (!fp) { return 0;
perror("fopen"); }
scc_argparse_drop(&argparse);
return 1; scc_lexer_t lexer;
scc_lexer_init(&lexer, scc_sstream_to_ring(&sstream));
if (config.emit_lex) {
scc_lexer_tok_ring_t *tok_ring = scc_lexer_to_ring(
&lexer, 8, config.output_file == null ? false : true);
if (config.output_file == null) {
print_ring(tok_ring, config.verbose);
} else {
print_file(tok_ring, config.output_file);
}
return 0;
} }
scc_pproc_t pproc; scc_pproc_t pproc;
scc_probe_stream_t *source_code_stream = from_file_stream(fp); scc_pproc_init(&pproc, scc_lexer_to_ring(&lexer, 8, true));
// scc_probe_stream_t *pprocessed_code_stream = scc_vec_foreach(config.include_paths, i) {
// scc_pproc_init(&pproc, source_code_stream); scc_pproc_add_include_path_cstr(&pproc,
scc_vec_at(config.include_paths, i));
scc_lexer_t lexer; }
scc_lexer_init(&lexer, source_code_stream); scc_lexer_tok_vec_t pproc_tok_vec;
scc_lexer_stream_t lexer_stream; scc_vec_init(pproc_tok_vec);
scc_lexer_to_stream(&lexer, &lexer_stream, false); scc_cstring_t pproc_macro_name = scc_cstring_from_cstr("__SCC__");
scc_pproc_add_object_macro(&(pproc.macro_table), &pproc_macro_name,
scc_parser_t parser; &pproc_tok_vec);
scc_parser_init(&parser, &lexer_stream, null); if (config.emit_pp) {
scc_ast_translation_unit_t *translation_unit = scc_lexer_tok_ring_t *tok_ring = scc_pproc_to_ring(&pproc, 8);
scc_parse_translation_unit(&parser); if (config.output_file == null) {
print_ring(tok_ring, config.verbose);
if (config.dump_ast) { } else {
scc_tree_dump_ctx_t tree_dump; print_file(tok_ring, config.output_file);
scc_tree_dump_ctx_init(&tree_dump, true); }
scc_ast_dump_node(&tree_dump, (scc_ast_node_t *)translation_unit);
scc_tree_dump_ctx_drop(&tree_dump);
return 0; return 0;
} }
scc_ir_builder_t ir_builder; scc_pproc_drop(&pproc);
scc_ast2ir(translation_unit, &ir_builder); scc_lexer_drop(&lexer);
scc_sstream_drop(&sstream);
if (config.dump_ir) { // scc_parser_t parser;
scc_ir_dump_ctx_t ir_dump_ctx; // scc_parser_init(&parser, &lexer_stream, null);
scc_tree_dump_ctx_t tree_dump; // helper for the ir dump only // scc_ast_translation_unit_t *translation_unit =
scc_tree_dump_ctx_init(&tree_dump, true); // scc_parse_translation_unit(&parser);
scc_ir_dump_ctx_init(&ir_dump_ctx, &tree_dump, &ir_builder.cprog,
&ir_builder.ctx); // if (config.emit_ast) {
// scc_ir_dump_cprog(&ir_dump_ctx); // scc_tree_dump_ctx_t tree_dump;
scc_ir_dump_cprog_linear(&ir_dump_ctx); // scc_tree_dump_ctx_init(&tree_dump, true);
scc_tree_dump_ctx_drop(&tree_dump); // scc_ast_dump_node(&tree_dump, (scc_ast_node_t *)translation_unit);
return 0; // scc_tree_dump_ctx_drop(&tree_dump);
} // return 0;
// }
// scc_ir_builder_t ir_builder;
// scc_ast2ir(translation_unit, &ir_builder);
// if (config.emit_ir) {
// scc_ir_dump_ctx_t ir_dump_ctx;
// scc_tree_dump_ctx_t tree_dump; // helper for the ir dump only
// scc_tree_dump_ctx_init(&tree_dump, true);
// scc_ir_dump_ctx_init(&ir_dump_ctx, &tree_dump, &ir_builder.cprog,
// &ir_builder.ctx);
// // scc_ir_dump_cprog(&ir_dump_ctx);
// scc_ir_dump_cprog_linear(&ir_dump_ctx);
// scc_tree_dump_ctx_drop(&tree_dump);
// return 0;
// }
scc_printf("output exe at %s", config.output_file); scc_printf("output exe at %s", config.output_file);
return 0; return 0;


@@ -1,4 +1,4 @@
"""cbuild.py - an optimized lightweight C build system""" """cbuild.py - an optimized lightweight C build system""" # pylint: disable=too-many-lines
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
import tomllib import tomllib
@@ -517,7 +517,7 @@ class BuildCache:
with open(self.cache_file, "r", encoding="utf-8") as f: with open(self.cache_file, "r", encoding="utf-8") as f:
data = json.load(f) data = json.load(f)
self.cache = {k: CacheEntry(**v) for k, v in data.items()} self.cache = {k: CacheEntry(**v) for k, v in data.items()}
except OSError, json.JSONDecodeError, TypeError: except (OSError, json.JSONDecodeError, TypeError):
self.cache = {} self.cache = {}
def save(self): def save(self):
@@ -564,6 +564,7 @@ class Compiler(ABC):
def __init__(self): def __init__(self):
self.recorded = [] self.recorded = []
self.recording = False self.recording = False
self.dry_run = False
def enable_recording(self, enable=True): def enable_recording(self, enable=True):
"""Enable command recording""" """Enable command recording"""
@@ -580,6 +581,8 @@ class Compiler(ABC):
"""Run a command""" """Run a command"""
self.record(cmd) self.record(cmd)
logger.debug("执行命令: %s", cmd) logger.debug("执行命令: %s", cmd)
if self.dry_run:
return # print only, do not execute
try: try:
subprocess.run(cmd, check=True) subprocess.run(cmd, check=True)
except subprocess.CalledProcessError as e: except subprocess.CalledProcessError as e:
@@ -679,6 +682,25 @@ class ClangCompiler(Compiler):
self.run(cmd) self.run(cmd)
class SccCompiler(Compiler):
"""SCC compiler"""
def get_flags(self, mode: BuildMode) -> list[str]:
return []
def compile(
self, source: Path, output: Path, includes: list[Path], flags: list[str]
):
# cmd = ["clang"] + flags + ["-c", str(source), "-o", str(output)]
cmd = ["scc", "--emit-pp", "-o", str(output), str(source), "-I", "scc_libs"]
for inc in includes:
cmd += ["-I", f"{inc}"]
self.run(cmd)
def link(self, objects: list[Path], output: Path, flags: list[str]):
pass
class DummyCompiler(Compiler): class DummyCompiler(Compiler):
"""Dummy compiler (for testing)""" """Dummy compiler (for testing)"""
@@ -1018,12 +1040,18 @@ def create_parser():
parser = argparse.ArgumentParser(description="轻量C构建系统", prog="cbuild") parser = argparse.ArgumentParser(description="轻量C构建系统", prog="cbuild")
parser.add_argument("--verbose", "-v", action="store_true", help="详细输出") parser.add_argument("--verbose", "-v", action="store_true", help="详细输出")
parser.add_argument("--path", "-p", default=".", help="项目路径") parser.add_argument("--path", "-p", default=".", help="项目路径")
subparsers = parser.add_subparsers(dest="command", required=True, metavar="COMMAND") subparsers = parser.add_subparsers(dest="command", required=True, metavar="COMMAND")
def add_common_args(subparser): def add_common_args(subparser):
subparser.add_argument( subparser.add_argument(
"--compiler", "-c", choices=["gcc", "clang"], default="gcc", help="编译器" "--compiler",
"-c",
choices=["gcc", "clang", "scc"],
default="gcc",
help="编译器",
)
subparser.add_argument(
"--dry-run", "-d", action="store_true", help="仅打印命令,不实际执行"
) )
subparser.add_argument("--record", "-r", action="store_true", help="记录命令") subparser.add_argument("--record", "-r", action="store_true", help="记录命令")
subparser.add_argument( subparser.add_argument(
@@ -1102,6 +1130,7 @@ def create_parser():
def main(): def main():
"""Main function""" """Main function"""
# print("current cwd: " + os.getcwd())
parser = create_parser() parser = create_parser()
args = parser.parse_args() args = parser.parse_args()
@@ -1117,9 +1146,12 @@ def main():
compiler_map = { compiler_map = {
"gcc": GccCompiler(), "gcc": GccCompiler(),
"clang": ClangCompiler(), "clang": ClangCompiler(),
"scc": SccCompiler(),
} }
compiler = compiler_map.get(args.compiler, GccCompiler()) compiler = compiler_map.get(args.compiler, GccCompiler())
if hasattr(args, "dry_run") and args.dry_run:
compiler.dry_run = True
if hasattr(args, "record") and args.record: if hasattr(args, "record") and args.record:
compiler.enable_recording() compiler.enable_recording()


@@ -1,53 +0,0 @@
"""Count the lines of C/C++ files under a directory (written by AI)"""
import os
def count_lines(file_path):
"""Count the number of lines in a single file"""
try:
with open(file_path, "rb") as f: # read in binary mode to avoid encoding issues
return sum(1 for _ in f)
except UnicodeDecodeError:
print(f"警告:无法解码文件 {file_path}(可能不是文本文件)")
return 0
except Exception as e:
print(f"读取 {file_path} 出错: {str(e)}")
return 0
def scan_files(directory, exclude_dirs=None):
"""Scan a directory and collect all C/C++ files"""
if exclude_dirs is None:
exclude_dirs = [".git", "venv", "__pycache__", ".old"] # directories excluded by default
c_files = []
for root, dirs, files in os.walk(directory):
# skip excluded directories
dirs[:] = [d for d in dirs if d not in exclude_dirs]
for file in files:
if file.endswith((".c", ".h")):
full_path = os.path.join(root, file)
c_files.append(full_path)
return c_files
def main():
"""main function"""
target_dir = input("请输入要扫描的目录路径(留空为当前目录): ") or "."
files = scan_files(target_dir)
total_lines = 0
print("\n统计结果:")
for idx, file in enumerate(files, 1):
lines = count_lines(file)
total_lines += lines
print(f"{idx:4d}. {file} ({lines} 行)")
print(f"\n总计: {len(files)} 个C/C++文件,共 {total_lines} 行代码")
if __name__ == "__main__":
main()