Compare commits

..

10 Commits

Author SHA1 Message Date
zzy
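b753ae0911 refactor(lex_parser): rename libcore to scc_core and restructure header includes
- Rename the dependency from libcore to scc_core
- Update the header include from <libcore.h> to <scc_core.h>
- No functional change

refactor(lexer): rename libcore to scc_core and add streaming lexer support

- Rename the dependency from libcore to scc_core
- Remove the scc_lexer_token struct definition, which is no longer needed here
- Rename struct cc_lexer to struct scc_lexer
- Add the definitions and implementation for the scc_lexer_stream_t streaming parser
- Add lexer_stream.c, which implements streaming token buffering

refactor(lexer_log): rename the logger variable and header guard

- Change the header guard macro from __SMCC_LEXER_LOG_H__ to __SCC_LEXER_LOG_H__
- Rename the logger variable from __smcc_lexer_log to __scc_lexer_log
- Update the header include from <libcore.h> to <scc_core.h>

refactor(lexer_token): reorganize the token header

- Change the header guard macro from __SMCC_CC_TOKEN_H__ to __SCC_LEXER_TOKEN_H__
- Update the header include from <libcore.h> to <scc_core.h>
- Move the scc_lexer_token struct definition into this file

refactor(lexer): simplify the formatting of the token-matching code

- Remove LCC-related comments
- Tidy the formatting of the bracket token-matching code, controlled by clang-format

refactor(pprocessor): update dependency names and header includes

- Rename libcore to scc_core
- Rename libutils to scc_utils
- Update the header include paths

refactor(runtime): rename libcore to scc_core and restructure directories

- Rename the libcore directory to scc_core
- Rename the libutils directory to scc_utils
- Update all related header include paths
- Change the package names in cbuild.toml
- Update the macro definitions in core_vec.h to support standard-library mode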
2026-01-08 11:22:27 +08:00
zzy
09f4ac8de0 feat(lex_parser, pprocessor): replace consume with next and remove stream resets
- Replace `scc_probe_stream_consume` with `scc_probe_stream_next` for consistent stream advancement
- Remove redundant `scc_probe_stream_reset` calls before peeking, as `next` and `peek` handle state
- Update `scc_cstring_new` to `scc_cstring_create` and `scc_pos_init` to `scc_pos_create` for naming consistency
- Change `scc_pp_macro_get` parameter to `const scc_cstring_t*` for better const-correctness
- Improves code clarity and maintains proper stream position tracking
2025-12-28 10:49:29 +08:00
zzy
07f5d9331b feat(lexer, preprocessor): replace cstring conversion with copy and refactor macro expansion
- Replace `scc_cstring_from_cstr(scc_cstring_as_cstr(...))` with `scc_cstring_copy()` in lexer to fix memory leaks
- Extract macro expansion logic into separate `expand_macro.c` file
- Remove `expand_stack` parameter from `scc_pp_expand_macro()` function
- Add new parsing functions for macro replacement lists and arguments
- Add string utility functions for whitespace trimming and string joining
- Update memory stream documentation for clarity
2025-12-15 20:24:39 +08:00
zzy
73d74f5e13 refactor(pprocessor): rename macro table type and update function names
- Change `scc_macro_table_t` to `scc_pp_macro_table_t` for consistency
- Rename `scc_pp_macro_create` to `scc_pp_macro_new` for naming convention
- Remove unused `scc_pp_compress_whitespace` function
- Update macro table function names: `scc_pp_find_macro` → `scc_pp_macro_table_get`, `scc_pp_remove_macro` → `scc_pp_macro_table_remove`
- Add new `scc_pp_macro_table_set` function for setting macros
- Update all function signatures to use new type name
- Remove commented-out whitespace compression code from implementation
2025-12-14 12:59:03 +08:00
zzy
ce8031b21f feat(parse): implement # and ## operator handling in macro expansion
- Add support for # (stringify) and ## (concatenation) operators in macro replacement lists
- Implement scc_pp_expand_string_unsafe() to process operator tokens during macro expansion
- Add helper macros to identify blank, stringify, and concatenation tokens in replacement lists
- Include missing headers (ctype.h, string.h) for character handling functions
- Update object macro expansion to use new string expansion function instead of simple concatenation
- Improve whitespace handling in macro replacement parsing to prevent interference with operator processing
2025-12-13 18:29:21 +08:00
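For readers unfamiliar with the two operators this commit handles, the sketch below shows their standard C behavior using an ordinary compiler, not scc's own preprocessor. The macro names `STR`, `XSTR`, `CAT` and the helper functions are illustrative only and do not appear in the repository.

```c
#include <assert.h>
#include <string.h>

#define STR(x) #x          /* stringify the argument as written */
#define XSTR(x) STR(x)     /* expand the argument first, then stringify */
#define CAT(a, b) a##b     /* paste two tokens into one */
#define VERSION 3

/* CAT(count, er) forms the identifier `counter`. */
static int paste_demo(void) {
    int CAT(count, er) = 41;
    counter += 1;
    return counter;
}

/* `#` suppresses expansion of its operand, so the macro name is kept. */
static const char *stringify_raw(void) { return STR(VERSION); }

/* The extra macro layer lets VERSION expand before `#` is applied. */
static const char *stringify_expanded(void) { return XSTR(VERSION); }
```

The difference between `STR` and `XSTR` is the usual reason a preprocessor has to treat operands of `#` and `##` separately from ordinary arguments during expansion.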
zzy
07a76d82f4 feat(lex_parser, pprocessor): rename identifier header check and add macro system
- Rename `scc_lex_parse_is_identifier_header` to `scc_lex_parse_is_identifier_prefix` for clarity and add a TODO comment
- Update lexer to use the renamed function for consistency
- Fix package and dependency names in `cbuild.toml` (`smcc_pprocesser` → `scc_pprocesser`, `smcc_lex_parser` → `lex_parser`)
- Introduce new macro system with header file `pp_macro.h` defining macro types, structures, and management functions
- Refactor preprocessor initialization and cleanup in `pprocessor.c` to use new macro table and stream handling
- Replace legacy `hashmap` with `scc_pp_macro_table_t` for macro storage
- Improve error handling and resource management in preprocessor lifecycle
2025-12-13 16:09:46 +08:00
zzy
874a58281f feat(lex_parser): rename functions and update header guard prefix
- Change header guard from `__SMCC_LEX_PARSER_H__` to `__SCC_LEX_PARSER_H__`
- Prefix all lexer functions with `scc_` (e.g., `lex_parse_char` → `scc_lex_parse_char`)
- Add new helper function `scc_lex_parse_is_identifier_header`
- Update references in source and test files to use new function names
- Replace `core_stream_eof` with `scc_stream_eof` for consistency
2025-12-13 14:06:13 +08:00
zzy
94d3f46fac refactor(lex_parser): replace core_pos_* functions with scc_pos_* equivalents
Update all position tracking calls in lex_parser.c to use the scc_pos_* function family (scc_pos_next, scc_pos_next_line) instead of the deprecated core_pos_* variants. This ensures consistency with the scc naming convention and prepares for future removal of the old core functions.
2025-12-12 21:33:51 +08:00
zzy
897ef449fb feat: add atomic profile update flag and update clean command
- Add `-fprofile-update=atomic` flag to both GCC and Clang compiler configurations for thread-safe coverage data updates
- Change `--all` argument in clean command to default to True and add `-a` short option for consistency with common CLI patterns
2025-12-11 17:30:33 +08:00
zzy
3aaf3a3991 feat: add SCF format library and rename components to SCC prefix
- Introduce new SCF (SCC Format) library with header, implementation, and test files
- SCF is a minimal executable/linkable format focused on internal linking with external symbol import/export abstraction
- Rename lexer and lex_parser packages from 'smcc_' to 'scc_' prefix for consistency
- Update hashmap implementation to use 'scc_' prefix for types and structures
- Add build configuration for new format library with dependencies on libcore and libutils
2025-12-11 17:29:12 +08:00
70 changed files with 3264 additions and 1050 deletions

0
libs/format/README.md Normal file

8
libs/format/cbuild.toml Normal file

@@ -0,0 +1,8 @@
[package]
name = "scc_format"
version = "0.1.0"
dependencies = [
{ name = "libcore", path = "../../runtime/libcore" },
{ name = "libutils", path = "../../runtime/libutils" },
]

150
libs/format/include/scf.h Normal file

@@ -0,0 +1,150 @@
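/**
* @file scf.h
* @brief scc format (SCF) header
*
* SCF is a minimal executable and linkable file format focused on internal linking.
* It also provides an abstract interface for importing and exporting external symbols.
*/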
#ifndef __SCC_FORMAT_H__
#define __SCC_FORMAT_H__
#include <stddef.h>
#include <stdint.h>
#define scf_byte_t uint8_t
#define scf_enum_t uint32_t
#define scf_size_t uint32_t
#ifdef __cplusplus
extern "C" {
#endif
/** SCF magic number */
#define SCF_MAGIC "SCF\0"
/** SCF version number */
#define SCF_VERSION 1
/** Architecture type */
typedef enum {
SCF_ARCH_UNKNOWN = 0,
SCF_ARCH_RV32 = 1,
SCF_ARCH_RV64 = 2,
SCF_ARCH_X86 = 3,
SCF_ARCH_X64 = 4,
} scf_arch_t;
/** File flags */
typedef enum {
SCF_FLAG_EXECUTABLE = 0x01, // executable file
SCF_FLAG_RELOCATABLE = 0x02, // relocatable file
SCF_FLAG_EXE_RELOC =
SCF_FLAG_EXECUTABLE | SCF_FLAG_RELOCATABLE, // executable after internal linking
} scf_flags_t;
/** Symbol type */
typedef enum {
SCF_SYM_TYPE_UNDEF = 0, // undefined
SCF_SYM_TYPE_FUNC = 1, // function
SCF_SYM_TYPE_DATA = 2, // data
SCF_SYM_TYPE_OBJECT = 3, // object
} scf_sym_type_t;
/** Symbol binding */
typedef enum {
SCF_SYM_BIND_LOCAL = 0, // local symbol
SCF_SYM_BIND_GLOBAL = 1, // global symbol
SCF_SYM_BIND_WEAK = 2, // weak reference
} scf_sym_bind_t;
/** Symbol visibility */
typedef enum {
SCF_SYM_VIS_DEFAULT = 0, // default visibility
SCF_SYM_VIS_HIDDEN = 1, // hidden
SCF_SYM_VIS_PROTECTED = 2, // protected
} scf_sym_vis_t;
/** Section type */
typedef enum {
SCF_SECT_NONE = 0, // none
SCF_SECT_CODE = 1, // code section
SCF_SECT_DATA = 2, // data section
SCF_SECT_BSS = 3, // BSS section (uninitialized data)
SCF_SECT_RODATA = 4, // read-only data
} scf_sect_type_t;
/** Relocation type */
typedef enum {
SCF_RELOC_ABS = 1, // absolute address
SCF_RELOC_REL = 2, // relative address
SCF_RELOC_PC = 3, // PC-relative
} scf_reloc_type_t;
/**
* @brief SCF file header
*/
typedef struct {
scf_byte_t magic[4]; // magic number: "SCF\0"
scf_enum_t type; // type
scf_enum_t version; // version number
scf_enum_t arch; // architecture
scf_enum_t flags; // flags
scf_size_t entry_point; // entry point address
scf_size_t data_size;
scf_size_t code_size;
scf_size_t strtab_size;
scf_size_t sym_count;
scf_size_t reloc_count;
} scf_header_t;
/**
* @brief SCF section
*/
typedef struct {
scf_size_t size;
scf_enum_t scf_sect_type;
scf_byte_t data[1];
} scf_sect_t;
/**
* @brief SCF symbol table entry
*/
typedef struct {
scf_size_t name_offset;
scf_enum_t scf_sym_type;
scf_enum_t scf_sym_bind;
scf_enum_t scf_sym_vis;
scf_enum_t scf_sect_type;
scf_size_t scf_sect_offset;
scf_size_t scf_sym_size;
} scf_sym_t;
/**
* @brief SCF relocation entry
*/
typedef struct {
scf_size_t offset; // offset within the section
scf_size_t sym_idx; // symbol index
scf_enum_t type; // relocation type
scf_enum_t sect_type; // section type (code section / data section)
scf_size_t addend; // addend
} scf_reloc_t;
/*
SCF file layout
scf_header_t header;
scf_sect_data_t text;
scf_sect_data_t data;
scf_sect_data_t symtab;
scf_sect_data_t reloc;
scf_sect_data_t strtab;
*/
#ifdef __cplusplus
}
#endif
#endif /* __SCC_FORMAT_H__ */
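The layout comment at the end of scf.h suggests that the serialized size follows directly from the header counts. Below is a minimal sketch of that arithmetic, assuming the sections are stored back to back in the listed order (header, text, data, symbols, relocations, string table). `demo_header_t` and `demo_expected_size` are hypothetical names, the struct holds only the size-related fields, and the record sizes are passed in by the caller instead of being hard-coded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the header: only the fields that determine size. */
typedef struct {
    uint8_t magic[4];
    uint32_t code_size;
    uint32_t data_size;
    uint32_t strtab_size;
    uint32_t sym_count;
    uint32_t reloc_count;
} demo_header_t;

/*
 * Returns 1 and stores the total byte count in *out when the magic matches
 * and the whole layout fits in `avail` bytes; returns 0 otherwise.
 * All sums are done in 64 bits so 32-bit counts cannot wrap around.
 */
static int demo_expected_size(const demo_header_t *h, uint64_t header_size,
                              uint64_t sym_size, uint64_t reloc_size,
                              uint64_t avail, uint64_t *out) {
    if (memcmp(h->magic, "SCF\0", 4) != 0) {
        return 0;
    }
    uint64_t total = header_size;
    total += h->code_size;
    total += h->data_size;
    total += (uint64_t)h->sym_count * sym_size;
    total += (uint64_t)h->reloc_count * reloc_size;
    total += h->strtab_size;
    if (total > avail) {
        return 0;
    }
    *out = total;
    return 1;
}
```

Checking the total once, before any copying, is one way to avoid repeating a bounds test for every section.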

View File

@@ -0,0 +1,42 @@
#ifndef __SCC_FORMAT_IMPL_H__
#define __SCC_FORMAT_IMPL_H__
#include <libcore.h>
#include <libutils.h>
#include <scf.h>
typedef SCC_VEC(u8) scf_sect_data_t;
typedef SCC_VEC(scf_sym_t) scf_sym_vec_t;
typedef SCC_VEC(scf_reloc_t) scf_reloc_vec_t;
typedef struct {
scf_header_t header;
scf_sect_data_t text;
scf_sect_data_t data;
scf_sect_data_t symtab;
scf_sect_data_t reloc;
scf_sect_data_t strtab;
scc_strpool_t strpool;
scc_hashtable_t str2idx;
scf_sym_vec_t syms;
scf_reloc_vec_t relocs;
} scf_t;
void scf_init(scf_t *scf);
cbool scf_parse(scf_t *scf, const char *buffer, usize size);
cbool scf_write(scf_t *scf, char *buffer, usize size);
cbool scf_exchange_section(scf_t *scf, scf_sect_type_t type,
scf_sect_data_t **section);
cbool scf_add_sym(scf_t *scf, scf_sym_t *sym);
cbool scf_add_reloc(scf_t *scf, scf_reloc_t *reloc);
cbool scf_apply_relocations(scf_t *scf);
cbool scf_write_done(scf_t *scf); // internal cleanup before writing
cbool scf_check_valid(scf_t *scf);
typedef SCC_VEC(scf_t) scf_vec_t;
cbool scf_link_all(scf_vec_t scfs, scf_t *outscf);
#endif /* __SCC_FORMAT_IMPL_H__ */
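`scf_parse` accepts a raw `const char *` buffer, which need not be aligned for a struct. One common way to read a fixed-size record from such a buffer is to copy it out with `memcpy` before inspecting it. The sketch below illustrates that idea on a deliberately reduced, hypothetical record (`demo_prefix_t`); it is not the layout of `scf_header_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced record: a magic number followed by one field. */
typedef struct {
    uint8_t magic[4];
    uint32_t version;
} demo_prefix_t;

/*
 * Copies the record out of `buffer` and checks the magic number.
 * memcpy works for any alignment of `buffer`, unlike a pointer cast.
 * Returns 1 if the record was read and the magic matches, 0 otherwise.
 */
static int demo_read_prefix(const char *buffer, size_t size,
                            demo_prefix_t *out) {
    if (buffer == NULL || out == NULL || size < sizeof *out) {
        return 0;
    }
    memcpy(out, buffer, sizeof *out);
    return memcmp(out->magic, "SCF\0", 4) == 0;
}
```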

469
libs/format/src/scf.c Normal file

@@ -0,0 +1,469 @@
/**
* @file scf.c
* @brief SCF format implementation
*/
#include <scf_impl.h>
/**
* @brief Initialize an SCF structure
* @param scf pointer to the scf_t structure
*/
void scf_init(scf_t *scf) {
if (!scf) {
return;
}
scc_memset(&scf->header, 0, sizeof(scf->header));
scc_memcpy(scf->header.magic, SCF_MAGIC, 4);
scf->header.version = SCF_VERSION;
scf->header.arch = SCF_ARCH_UNKNOWN;
scf->header.flags = 0;
scf->header.entry_point = 0;
scf->header.data_size = 0;
scf->header.code_size = 0;
scf->header.strtab_size = 0;
scf->header.sym_count = 0;
scf->header.reloc_count = 0;
scc_vec_init(scf->text);
scc_vec_init(scf->data);
scc_vec_init(scf->symtab);
scc_vec_init(scf->reloc);
scc_vec_init(scf->strtab);
scc_strpool_init(&scf->strpool);
scf->str2idx.hash_func = (void *)scc_strhash32;
scf->str2idx.key_cmp = (void *)scc_strcmp;
scc_hashtable_init(&scf->str2idx);
scc_vec_init(scf->syms);
scc_vec_init(scf->relocs);
}
/**
* @brief Read and parse SCF data from a buffer
* @param scf pointer to the scf_t structure
* @param buffer input buffer
* @param size buffer size
* @return true on success, false on failure
*/
cbool scf_parse(scf_t *scf, const char *buffer, usize size) {
if (!scf || !buffer || size < sizeof(scf_header_t)) {
return false;
}
// Read the header
const scf_header_t *header = (const scf_header_t *)buffer;
if (scc_memcmp(header->magic, SCF_MAGIC, 4) != 0) {
return false;
}
scf->header = *header;
// Track the offset of each section
usize offset = sizeof(scf_header_t);
// Read the text section
if (scf->header.code_size > 0) {
if (offset + scf->header.code_size > size) {
return false;
}
// Grow the text vector to the required size
while (scf->text.size < scf->header.code_size) {
scc_vec_push(scf->text, 0);
}
scc_memcpy(scf->text.data, buffer + offset, scf->header.code_size);
offset += scf->header.code_size;
}
// Read the data section
if (scf->header.data_size > 0) {
if (offset + scf->header.data_size > size) {
return false;
}
// Grow the data vector to the required size
while (scf->data.size < scf->header.data_size) {
scc_vec_push(scf->data, 0);
}
scc_memcpy(scf->data.data, buffer + offset, scf->header.data_size);
offset += scf->header.data_size;
}
// Read the symbol table
if (scf->header.sym_count > 0) {
usize symtab_size = scf->header.sym_count * sizeof(scf_sym_t);
if (offset + symtab_size > size) {
return false;
}
// Grow the syms vector to the required size
while (scf->syms.size < scf->header.sym_count) {
scf_sym_t sym = {0};
scc_vec_push(scf->syms, sym);
}
scc_memcpy(scf->syms.data, buffer + offset, symtab_size);
offset += symtab_size;
}
// Read the relocation table
if (scf->header.reloc_count > 0) {
usize reloc_size = scf->header.reloc_count * sizeof(scf_reloc_t);
if (offset + reloc_size > size) {
return false;
}
// Grow the relocs vector to the required size
while (scf->relocs.size < scf->header.reloc_count) {
scf_reloc_t reloc = {0};
scc_vec_push(scf->relocs, reloc);
}
scc_memcpy(scf->relocs.data, buffer + offset, reloc_size);
offset += reloc_size;
}
// Read the string table
if (scf->header.strtab_size > 0) {
if (offset + scf->header.strtab_size > size) {
return false;
}
// Grow the strtab vector to the required size
while (scf->strtab.size < scf->header.strtab_size) {
scc_vec_push(scf->strtab, 0);
}
scc_memcpy(scf->strtab.data, buffer + offset, scf->header.strtab_size);
offset += scf->header.strtab_size;
}
// offset <= size is allowed because the buffer may be larger than the actual data
if (offset > size) {
return false;
}
return true;
}
/**
* @brief Write SCF data to a buffer
* @param scf pointer to the scf_t structure
* @param buffer output buffer
* @param size buffer size
* @return true on success, false on failure
*/
cbool scf_write(scf_t *scf, char *buffer, usize size) {
if (!scf || !buffer) {
return false;
}
// Compute the required size
usize needed = sizeof(scf_header_t);
needed += scf->header.code_size; // text section
needed += scf->header.data_size; // data section
needed += scf->header.sym_count * sizeof(scf_sym_t);
needed += scf->header.reloc_count * sizeof(scf_reloc_t);
needed += scf->header.strtab_size;
if (size < needed) {
return false;
}
// Write the header
scf_header_t *header = (scf_header_t *)buffer;
*header = scf->header;
usize offset = sizeof(scf_header_t);
// Write the text section
if (scf->header.code_size > 0) {
scc_memcpy(buffer + offset, scf->text.data, scf->header.code_size);
offset += scf->header.code_size;
}
// Write the data section
if (scf->header.data_size > 0) {
scc_memcpy(buffer + offset, scf->data.data, scf->header.data_size);
offset += scf->header.data_size;
}
// Write the symbol table
if (scf->header.sym_count > 0) {
usize symtab_size = scf->header.sym_count * sizeof(scf_sym_t);
scc_memcpy(buffer + offset, scf->syms.data, symtab_size);
offset += symtab_size;
}
// Write the relocation table
if (scf->header.reloc_count > 0) {
usize reloc_size = scf->header.reloc_count * sizeof(scf_reloc_t);
scc_memcpy(buffer + offset, scf->relocs.data, reloc_size);
offset += reloc_size;
}
// Write the string table
if (scf->header.strtab_size > 0) {
scc_memcpy(buffer + offset, scf->strtab.data, scf->header.strtab_size);
offset += scf->header.strtab_size;
}
Assert(offset <= size);
return true;
}
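/**
* @brief Exchange section data
* @param scf pointer to the scf_t structure
* @param type section type
* @param section pointer to the new section data (swapped in both directions)
* @return true on success, false on failure
*/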
cbool scf_exchange_section(scf_t *scf, scf_sect_type_t type,
scf_sect_data_t **section) {
if (!scf || !section || !*section) {
return false;
}
scf_sect_data_t *target_section = NULL;
scf_size_t *size_field = NULL;
// Select the target section and size field by type
switch (type) {
case SCF_SECT_CODE:
target_section = &scf->text;
size_field = &scf->header.code_size;
break;
case SCF_SECT_DATA:
target_section = &scf->data;
size_field = &scf->header.data_size;
break;
case SCF_SECT_RODATA:
// The current implementation has no separate rodata section; the data section is used
target_section = &scf->data;
size_field = &scf->header.data_size;
break;
case SCF_SECT_BSS:
// The BSS section stores no data, only a size
// Exchanging the BSS section is not supported for now
return false;
default:
return false;
}
// Swap the section data
scf_sect_data_t temp = *target_section;
*target_section = **section;
**section = temp;
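// Update the size field
if (size_field) {
// Save the old size (if needed)
// scf_size_t temp_size = *size_field; // unused, commented out
*size_field = (scf_size_t)(*target_section).size;
// The new size could be reported to the caller here,
// but the current API does not provide for that
}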
return true;
}
/**
* @brief Add a symbol to the symbol table
* @param scf pointer to the scf_t structure
* @param sym pointer to the symbol to add
* @return true on success, false on failure
*/
cbool scf_add_sym(scf_t *scf, scf_sym_t *sym) {
if (!scf || !sym) {
return false;
}
// Append to the symbol vector
scc_vec_push(scf->syms, *sym);
scf->header.sym_count++;
return true;
}
/**
* @brief Add a relocation entry
* @param scf pointer to the scf_t structure
* @param reloc pointer to the relocation entry to add
* @return true on success, false on failure
*/
cbool scf_add_reloc(scf_t *scf, scf_reloc_t *reloc) {
if (!scf || !reloc) {
return false;
}
// Append to the relocation vector
scc_vec_push(scf->relocs, *reloc);
scf->header.reloc_count++;
return true;
}
/**
* @brief Check whether an SCF structure is valid
* @param scf pointer to the scf_t structure
* @return true if valid, false if invalid
*/
cbool scf_check_valid(scf_t *scf) {
if (!scf) {
return false;
}
// Check the magic number
if (scc_memcmp(scf->header.magic, SCF_MAGIC, 4) != 0) {
return false;
}
// Check the version
if (scf->header.version != SCF_VERSION) {
return false;
}
// Check the architecture
if (scf->header.arch > SCF_ARCH_X64) {
return false;
}
// Check that the section sizes are consistent with the header
if (scf->header.code_size != scf->text.size) {
return false;
}
if (scf->header.data_size != scf->data.size) {
return false;
}
if (scf->header.sym_count != scf->syms.size) {
return false;
}
if (scf->header.reloc_count != scf->relocs.size) {
return false;
}
if (scf->header.strtab_size != scf->strtab.size) {
return false;
}
return true;
}
/**
* @brief Apply relocations to the section data
* @param scf pointer to the scf_t structure
* @return true on success, false on failure
*/
cbool scf_apply_relocations(scf_t *scf) {
if (!scf) {
return false;
}
// Iterate over all relocation entries
for (usize i = 0; i < scf->relocs.size; i++) {
scf_reloc_t *reloc = &scf->relocs.data[i];
// Select the target section by section type
scf_sect_data_t *target_section = NULL;
switch (reloc->sect_type) {
case SCF_SECT_CODE:
target_section = &scf->text;
break;
case SCF_SECT_DATA:
case SCF_SECT_RODATA:
target_section = &scf->data;
break;
default:
// Unsupported section type
continue;
}
// Check that the offset is valid
if (reloc->offset + sizeof(scf_size_t) > target_section->size) {
// The offset is outside the section
return false;
}
// Look up the symbol address
scf_size_t symbol_address = 0;
if (reloc->sym_idx < scf->syms.size) {
scf_sym_t *sym = &scf->syms.data[reloc->sym_idx];
symbol_address = sym->scf_sect_offset;
}
// Compute the relocation value
scf_size_t relocation_value = symbol_address + reloc->addend;
// Apply according to the relocation type
scf_size_t *target =
(scf_size_t *)(target_section->data + reloc->offset);
switch (reloc->type) {
case SCF_RELOC_ABS:
// Absolute address: replace directly
*target = relocation_value;
break;
case SCF_RELOC_REL:
// Relative address: compute the offset from the current location
// Use uintptr_t for a safe pointer-to-integer conversion
*target = relocation_value - (scf_size_t)(uintptr_t)target;
break;
case SCF_RELOC_PC:
// PC-relative: compute the offset from the PC
// Use uintptr_t for a safe pointer-to-integer conversion
*target = relocation_value - (scf_size_t)(uintptr_t)(target + 1);
break;
default:
// Unsupported relocation type
return false;
}
}
return true;
}
/**
* @brief Perform internal cleanup before writing
* @param scf pointer to the scf_t structure
* @return true on success, false on failure
*/
cbool scf_write_done(scf_t *scf) {
if (!scf) {
return false;
}
// Apply all relocations
if (!scf_apply_relocations(scf)) {
return false;
}
// Update the size fields in the header
scf->header.code_size = (scf_size_t)scf->text.size;
scf->header.data_size = (scf_size_t)scf->data.size;
scf->header.sym_count = (scf_size_t)scf->syms.size;
scf->header.reloc_count = (scf_size_t)scf->relocs.size;
scf->header.strtab_size = (scf_size_t)scf->strtab.size;
// Mark the file as an executable after internal linking
scf->header.flags |= SCF_FLAG_EXE_RELOC;
return true;
}
/**
* @brief Link multiple SCF files
* @param scfs vector containing the SCF files
* @param outscf output: the linked SCF file
* @return true on success, false on failure
*/
cbool scf_link_all(scf_vec_t scfs, scf_t *outscf) {
if (!outscf || scfs.size == 0) {
return false;
}
// Initialize the output SCF
scf_init(outscf);
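// Simple implementation: only the first file is linked
// A real implementation should merge the sections of all files, resolve symbol references, apply relocations, and so on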
if (scfs.size > 0) {
scf_t *first = &scfs.data[0];
// A deep copy should be made here, but for simplicity only the header is copied
outscf->header = first->header;
}
return true;
}
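`scf_apply_relocations` above subtracts the host address of the patched field for the REL and PC cases. As a point of comparison, here is a minimal sketch of the conventional S + A - P form, where every quantity is an offset in one flat image and not a host pointer. This is an assumption about the intended semantics, not the author's design; `demo_reloc_type_t` and `demo_apply_reloc` are hypothetical names, and the addend is treated as signed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DEMO_RELOC_ABS = 1, DEMO_RELOC_PC = 3 } demo_reloc_type_t;

/*
 * Patches the 32-bit field at offset `place` inside `sect` (length `len`).
 *   ABS: value = S + A
 *   PC:  value = S + A - P
 * S is the symbol offset, A the addend, P the offset of the patched field.
 * Returns 1 on success, 0 if the field does not fit or the type is unknown.
 */
static int demo_apply_reloc(uint8_t *sect, size_t len, uint32_t place,
                            uint32_t sym_addr, int32_t addend,
                            demo_reloc_type_t type) {
    if (len < sizeof(uint32_t) || place > len - sizeof(uint32_t)) {
        return 0;
    }
    uint32_t value;
    switch (type) {
    case DEMO_RELOC_ABS:
        value = sym_addr + (uint32_t)addend;
        break;
    case DEMO_RELOC_PC:
        value = sym_addr + (uint32_t)addend - place;
        break;
    default:
        return 0;
    }
    /* memcpy avoids an unaligned store when `place` is not a multiple of 4. */
    memcpy(sect + place, &value, sizeof value);
    return 1;
}
```

With a 24-byte code section followed immediately by data (a hypothetical placement), a field at offset 7 with addend -4 that refers to offset 24 receives 13, which is the distance from the end of the field (offset 11) to the target.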

View File

@@ -0,0 +1,109 @@
/**
* @file test_scf.c
* @brief SCF format tests
*/
#include <scf_impl.h>
#include <stdio.h>
#include <string.h>
int main() {
printf("Testing SCF format implementation...\n");
// Test 1: Initialization
scf_t scf;
scf_init(&scf);
if (memcmp(scf.header.magic, SCF_MAGIC, 4) != 0) {
printf("FAIL: Magic number incorrect\n");
return 1;
}
if (scf.header.version != SCF_VERSION) {
printf("FAIL: Version incorrect\n");
return 1;
}
if (scf.header.arch != SCF_ARCH_UNKNOWN) {
printf("FAIL: Architecture incorrect\n");
return 1;
}
printf("Test 1 PASSED: Initialization successful\n");
// Test 2: Adding symbols
scf_sym_t sym = {0};
sym.name_offset = 0;
sym.scf_sym_type = SCF_SYM_TYPE_FUNC;
sym.scf_sym_bind = SCF_SYM_BIND_GLOBAL;
sym.scf_sym_vis = SCF_SYM_VIS_DEFAULT;
sym.scf_sect_type = SCF_SECT_CODE;
sym.scf_sect_offset = 0;
sym.scf_sym_size = 16;
if (!scf_add_sym(&scf, &sym)) {
printf("FAIL: Cannot add symbol\n");
return 1;
}
if (scf.header.sym_count != 1) {
printf("FAIL: Symbol count incorrect\n");
return 1;
}
printf("Test 2 PASSED: Symbol addition successful\n");
// Test 3: Adding relocations
scf_reloc_t reloc = {0};
reloc.offset = 0; // offset
reloc.sym_idx = 0;
reloc.type = SCF_RELOC_ABS;
reloc.sect_type = SCF_SECT_CODE; // code section
reloc.addend = 0;
if (!scf_add_reloc(&scf, &reloc)) {
printf("FAIL: Cannot add relocation\n");
return 1;
}
if (scf.header.reloc_count != 1) {
printf("FAIL: Relocation count incorrect\n");
return 1;
}
printf("Test 3 PASSED: Relocation addition successful\n");
// Test 4: Checking validity
if (!scf_check_valid(&scf)) {
printf("FAIL: SCF structure invalid\n");
return 1;
}
printf("Test 4 PASSED: SCF structure valid\n");
// Test 5: Writing and reading
char buffer[1024];
if (!scf_write(&scf, buffer, sizeof(buffer))) {
printf("FAIL: Cannot write to buffer\n");
return 1;
}
scf_t scf2;
scf_init(&scf2);
if (!scf_parse(&scf2, buffer, sizeof(buffer))) {
printf("FAIL: Cannot read from buffer\n");
return 1;
}
// Compare the two structures
if (memcmp(&scf.header, &scf2.header, sizeof(scf_header_t)) != 0) {
printf("FAIL: Header mismatch\n");
return 1;
}
printf("Test 5 PASSED: Write/read successful\n");
printf("All tests passed!\n");
return 0;
}

View File

@@ -0,0 +1,184 @@
/**
* @file test_scf_x64.c
* @brief SCF x64 architecture tests
*
* This test creates a simple x64 executable similar to the Rust example
* provided by the user.
*/
#include <scf_impl.h>
#include <stdio.h>
#include <string.h>
int main() {
printf("Testing SCF x64 format implementation...\n");
// Test 1: Initialize SCF for x64 architecture
scf_t scf;
scf_init(&scf);
// Set architecture to x64
scf.header.arch = SCF_ARCH_X64;
scf.header.flags = SCF_FLAG_EXE_RELOC;
printf("Test 1 PASSED: x64 initialization successful\n");
// Test 2: Add .text section with x64 machine code
// x64 machine code for:
// sub rsp, 0x28
// lea rcx, [rip + data_offset] ; will be relocated
// call [rip + printf_iat] ; will be relocated
// add rsp, 0x28
// xor eax, eax
// ret
unsigned char x64_code[] = {
0x48, 0x83, 0xEC, 0x28, // sub rsp, 0x28
0x48, 0x8D, 0x0D, 0x00, 0x00, 0x00,
0x00, // lea rcx, [rip + 0] (to be relocated)
0xFF, 0x15, 0x00, 0x00, 0x00, 0x00, // call [rip + 0] (to be relocated)
0x48, 0x83, 0xC4, 0x28, // add rsp, 0x28
0x33, 0xC0, // xor eax, eax
0xC3 // ret
};
// Add code to .text section
for (usize i = 0; i < sizeof(x64_code); i++) {
scc_vec_push(scf.text, x64_code[i]);
}
scf.header.code_size = (scf_size_t)scf.text.size;
printf("Test 2 PASSED: x64 code added to .text section\n");
// Test 3: Add .data section with string
const char hello_world[] = "Hello, World from SCF x64 Test!\n\0";
for (usize i = 0; i < sizeof(hello_world); i++) {
scc_vec_push(scf.data, hello_world[i]);
}
scf.header.data_size = (scf_size_t)scf.data.size;
printf("Test 3 PASSED: Data string added to .data section\n");
// Test 4: Add symbols
scf_sym_t data_sym = {0};
data_sym.name_offset = 0; // Would need string table for actual names
data_sym.scf_sym_type = SCF_SYM_TYPE_DATA;
data_sym.scf_sym_bind = SCF_SYM_BIND_GLOBAL;
data_sym.scf_sym_vis = SCF_SYM_VIS_DEFAULT;
data_sym.scf_sect_type = SCF_SECT_DATA;
data_sym.scf_sect_offset = 0; // Start of data section
data_sym.scf_sym_size = sizeof(hello_world);
if (!scf_add_sym(&scf, &data_sym)) {
printf("FAIL: Cannot add data symbol\n");
return 1;
}
scf_sym_t code_sym = {0};
code_sym.name_offset = 0;
code_sym.scf_sym_type = SCF_SYM_TYPE_FUNC;
code_sym.scf_sym_bind = SCF_SYM_BIND_GLOBAL;
code_sym.scf_sym_vis = SCF_SYM_VIS_DEFAULT;
code_sym.scf_sect_type = SCF_SECT_CODE;
code_sym.scf_sect_offset = 0; // Start of code section
code_sym.scf_sym_size = sizeof(x64_code);
if (!scf_add_sym(&scf, &code_sym)) {
printf("FAIL: Cannot add code symbol\n");
return 1;
}
printf("Test 4 PASSED: Symbols added\n");
// Test 5: Add relocations
// First relocation: data reference at offset 7 in code (lea rcx, [rip +
// data_offset])
scf_reloc_t data_reloc = {0};
data_reloc.offset = 7; // Offset in code section
data_reloc.sym_idx = 0; // Index of data symbol (first symbol added)
data_reloc.type = SCF_RELOC_PC; // PC-relative relocation
data_reloc.sect_type = SCF_SECT_CODE;
data_reloc.addend = -4; // RIP-relative addressing adjustment
if (!scf_add_reloc(&scf, &data_reloc)) {
printf("FAIL: Cannot add data relocation\n");
return 1;
}
// Second relocation: external function reference at offset 13 in code (call
// [rip + printf_iat])
scf_reloc_t func_reloc = {0};
func_reloc.offset = 13; // Offset in code section
func_reloc.sym_idx =
1; // Index of code symbol (would be external in real case)
func_reloc.type = SCF_RELOC_PC; // PC-relative relocation
func_reloc.sect_type = SCF_SECT_CODE;
func_reloc.addend = -4; // RIP-relative addressing adjustment
if (!scf_add_reloc(&scf, &func_reloc)) {
printf("FAIL: Cannot add function relocation\n");
return 1;
}
printf("Test 5 PASSED: Relocations added\n");
// Test 6: Apply relocations
if (!scf_apply_relocations(&scf)) {
printf("FAIL: Cannot apply relocations\n");
return 1;
}
printf("Test 6 PASSED: Relocations applied\n");
// Test 7: Prepare for writing (internal cleanup)
if (!scf_write_done(&scf)) {
printf("FAIL: Cannot prepare for writing\n");
return 1;
}
printf("Test 7 PASSED: Prepared for writing\n");
// Test 8: Write to buffer
char buffer[4096];
if (!scf_write(&scf, buffer, sizeof(buffer))) {
printf("FAIL: Cannot write to buffer\n");
return 1;
}
printf("Test 8 PASSED: Written to buffer\n");
// Test 9: Parse from buffer
scf_t scf2;
scf_init(&scf2);
if (!scf_parse(&scf2, buffer, sizeof(buffer))) {
printf("FAIL: Cannot parse from buffer\n");
return 1;
}
// Verify architecture
if (scf2.header.arch != SCF_ARCH_X64) {
printf("FAIL: Architecture not preserved\n");
return 1;
}
printf("Test 9 PASSED: Parsed from buffer, architecture preserved\n");
// Test 10: Verify structure
if (!scf_check_valid(&scf2)) {
printf("FAIL: Parsed structure invalid\n");
return 1;
}
printf("Test 10 PASSED: Parsed structure valid\n");
printf("\nAll x64 tests passed!\n");
printf("Created SCF file with:\n");
printf(" Architecture: x64\n");
printf(" Code size: %u bytes\n", scf.header.code_size);
printf(" Data size: %u bytes\n", scf.header.data_size);
printf(" Symbols: %u\n", scf.header.sym_count);
printf(" Relocations: %u\n", scf.header.reloc_count);
printf(" Flags: 0x%x\n", scf.header.flags);
return 0;
}

View File

@@ -1,5 +1,5 @@
[package]
name = "smcc_lex_parser"
name = "scc_lex_parser"
version = "0.1.0"
dependencies = [{ name = "libcore", path = "../../runtime/libcore" }]
dependencies = [{ name = "scc_core", path = "../../runtime/scc_core" }]

View File

@@ -1,26 +1,32 @@
#ifndef __SMCC_LEX_PARSER_H__
#define __SMCC_LEX_PARSER_H__
#ifndef __SCC_LEX_PARSER_H__
#define __SCC_LEX_PARSER_H__
#include <libcore.h>
#include <scc_core.h>
static inline cbool lex_parse_is_endline(int ch) {
static inline cbool scc_lex_parse_is_endline(int ch) {
return ch == '\n' || ch == '\r';
}
static inline cbool lex_parse_is_whitespace(int ch) {
static inline cbool scc_lex_parse_is_whitespace(int ch) {
return ch == ' ' || ch == '\t';
}
int lex_parse_char(scc_probe_stream_t *input, scc_pos_t *pos);
cbool lex_parse_string(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output);
cbool lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
usize *output);
cbool lex_parse_identifier(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output);
void lex_parse_skip_endline(scc_probe_stream_t *input, scc_pos_t *pos);
void lex_parse_skip_block_comment(scc_probe_stream_t *input, scc_pos_t *pos);
void lex_parse_skip_line(scc_probe_stream_t *input, scc_pos_t *pos);
void lex_parse_skip_whitespace(scc_probe_stream_t *input, scc_pos_t *pos);
// TODO identifier check is right?
static inline cbool scc_lex_parse_is_identifier_prefix(int ch) {
return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_';
}
#endif /* __SMCC_LEX_PARSER_H__ */
int scc_lex_parse_char(scc_probe_stream_t *input, scc_pos_t *pos);
cbool scc_lex_parse_string(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output);
cbool scc_lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
usize *output);
cbool scc_lex_parse_identifier(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output);
void scc_lex_parse_skip_endline(scc_probe_stream_t *input, scc_pos_t *pos);
void scc_lex_parse_skip_block_comment(scc_probe_stream_t *input,
scc_pos_t *pos);
void scc_lex_parse_skip_line(scc_probe_stream_t *input, scc_pos_t *pos);
void scc_lex_parse_skip_whitespace(scc_probe_stream_t *input, scc_pos_t *pos);
#endif /* __SCC_LEX_PARSER_H__ */
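The end-of-line helpers declared in this header treat `\r\n`, a lone `\r`, and a lone `\n` each as a single line break. The sketch below shows that rule on a plain NUL-terminated string. `demo_cursor_t` and the `demo_*` functions are hypothetical and do not use the `scc_probe_stream_t` or `scc_pos_t` APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a NUL-terminated buffer. */
typedef struct {
    const char *text;
    size_t offset;
    size_t line; /* 1-based line number */
} demo_cursor_t;

static int demo_is_endline(int ch) { return ch == '\n' || ch == '\r'; }

/* Consumes one line ending ("\r\n", "\r", or "\n"). Returns 1 if consumed. */
static int demo_skip_endline(demo_cursor_t *c) {
    char ch = c->text[c->offset];
    if (ch == '\r') {
        c->offset++;
        if (c->text[c->offset] == '\n') {
            c->offset++; /* "\r\n" counts as one line break */
        }
    } else if (ch == '\n') {
        c->offset++;
    } else {
        return 0;
    }
    c->line++;
    return 1;
}

/* Returns the line number reached at the end of `text`. */
static size_t demo_count_lines(const char *text) {
    demo_cursor_t c = {text, 0, 1};
    while (c.text[c.offset] != '\0') {
        if (!demo_skip_endline(&c)) {
            c.offset++;
        }
    }
    return c.line;
}
```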

View File

@@ -1,19 +1,19 @@
#include <lex_parser.h>
void lex_parse_skip_endline(scc_probe_stream_t *input, scc_pos_t *pos) {
void scc_lex_parse_skip_endline(scc_probe_stream_t *input, scc_pos_t *pos) {
Assert(input != null && pos != null);
scc_probe_stream_reset(input);
// scc_probe_stream_reset(input);
int ch = scc_probe_stream_peek(input);
if (ch == '\r') {
scc_probe_stream_consume(input);
scc_probe_stream_next(input);
ch = scc_probe_stream_peek(input);
if (ch == '\n') {
scc_probe_stream_consume(input);
scc_probe_stream_next(input);
}
core_pos_next_line(pos);
scc_pos_next_line(pos);
} else if (ch == '\n') {
scc_probe_stream_consume(input);
core_pos_next_line(pos);
scc_probe_stream_next(input);
scc_pos_next_line(pos);
} else {
LOG_WARN("not a newline character");
}
@@ -57,81 +57,82 @@ static inline int got_simple_escape(int ch) {
/* clang-format on */
}
void lex_parse_skip_line(scc_probe_stream_t *input, scc_pos_t *pos) {
void scc_lex_parse_skip_line(scc_probe_stream_t *input, scc_pos_t *pos) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null);
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
while (1) {
int ch = scc_probe_stream_peek(stream);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
return;
}
// TODO endline
if (lex_parse_is_endline(ch)) {
lex_parse_skip_endline(stream, pos);
if (scc_lex_parse_is_endline(ch)) {
scc_lex_parse_skip_endline(stream, pos);
return;
} else {
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
}
}
}
void lex_parse_skip_block_comment(scc_probe_stream_t *input, scc_pos_t *pos) {
void scc_lex_parse_skip_block_comment(scc_probe_stream_t *input,
scc_pos_t *pos) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null);
int ch;
scc_probe_stream_reset(stream);
ch = scc_probe_stream_consume(stream);
core_pos_next(pos);
// scc_probe_stream_reset(stream);
ch = scc_probe_stream_next(stream);
scc_pos_next(pos);
// FIXME Assertion
Assert(ch == '/');
ch = scc_probe_stream_consume(stream);
core_pos_next(pos);
ch = scc_probe_stream_next(stream);
scc_pos_next(pos);
Assert(ch == '*');
// all ready match `/*`
while (1) {
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
ch = scc_probe_stream_peek(stream);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unterminated block comment");
return;
}
if (lex_parse_is_endline(ch)) {
lex_parse_skip_endline(stream, pos);
if (scc_lex_parse_is_endline(ch)) {
scc_lex_parse_skip_endline(stream, pos);
continue;
}
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
if (ch == '*') {
ch = scc_probe_stream_peek(stream);
if (ch == '/') {
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
return;
}
}
}
}
void lex_parse_skip_whitespace(scc_probe_stream_t *input, scc_pos_t *pos) {
void scc_lex_parse_skip_whitespace(scc_probe_stream_t *input, scc_pos_t *pos) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null);
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
while (1) {
int ch = scc_probe_stream_peek(stream);
if (!lex_parse_is_whitespace(ch)) {
if (!scc_lex_parse_is_whitespace(ch)) {
return;
}
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
}
}
@@ -142,14 +143,14 @@ static inline cbool _lex_parse_uint(scc_probe_stream_t *input, scc_pos_t *pos,
return false;
}
Assert(base == 2 || base == 8 || base == 10 || base == 16);
scc_probe_stream_reset(input);
// scc_probe_stream_reset(input);
int ch, tmp;
usize n = 0;
usize offset = pos->offset;
while (1) {
ch = scc_probe_stream_peek(input);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
break;
} else if (ch >= 'a' && ch <= 'z') {
tmp = ch - 'a' + 10;
@@ -166,8 +167,8 @@ static inline cbool _lex_parse_uint(scc_probe_stream_t *input, scc_pos_t *pos,
return false;
}
scc_probe_stream_consume(input);
core_pos_next(pos);
scc_probe_stream_next(input);
scc_pos_next(pos);
n = n * base + tmp;
// TODO number overflow
}
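The `// TODO number overflow` note in the loop above could be handled by checking the accumulator before each multiply-add. What follows is a minimal sketch under stated assumptions, not the project's code: `parse_uint_checked` and `digit_value` are hypothetical names, a plain C string stands in for `scc_probe_stream_t`, and `size_t` stands in for `usize`. Unlike `_lex_parse_uint`, it simply stops at the first character that is not a digit in the given base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Map a character to its digit value, or -1 if it is not a digit. */
static int digit_value(int ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'z') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'Z') return ch - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: parse an unsigned integer in `base` from `s`.
 * Returns false if no digit is consumed or the value would overflow size_t.
 * On success stores the value in *out and the number of digits in *len.
 */
static bool parse_uint_checked(const char *s, int base, size_t *out,
                               size_t *len) {
    assert(s != NULL && out != NULL && len != NULL);
    assert(base == 2 || base == 8 || base == 10 || base == 16);
    size_t n = 0, i = 0;
    for (;; ++i) {
        int d = digit_value((unsigned char)s[i]);
        if (d < 0 || d >= base) break;
        /* n * base + d > SIZE_MAX  <=>  n > (SIZE_MAX - d) / base */
        if (n > (SIZE_MAX - (size_t)d) / (size_t)base) return false;
        n = n * (size_t)base + (size_t)d;
    }
    if (i == 0) return false;
    *out = n;
    *len = i;
    return true;
}
```

The division-based test avoids performing the overflowing multiplication at all, so it does not rely on wraparound behaviour.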
@@ -187,32 +188,31 @@ static inline cbool _lex_parse_uint(scc_probe_stream_t *input, scc_pos_t *pos,
* @return int
* https://cppreference.cn/w/c/language/character_constant
*/
int lex_parse_char(scc_probe_stream_t *input, scc_pos_t *pos) {
int scc_lex_parse_char(scc_probe_stream_t *input, scc_pos_t *pos) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null);
scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
int ret = core_stream_eof;
int ch = scc_probe_stream_next(stream);
scc_pos_next(pos);
int ret = scc_stream_eof;
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unexpected EOF at begin");
goto ERR;
} else if (ch != '\'') {
LOG_WARN("Unexpected character '%c' at begin", ch);
goto ERR;
}
scc_probe_stream_consume(stream);
core_pos_next(pos);
// scc_probe_stream_next(stream);
ch = scc_probe_stream_consume(stream);
core_pos_next(pos);
ch = scc_probe_stream_next(stream);
scc_pos_next(pos);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unexpected EOF at middle");
goto ERR;
} else if (ch == '\\') {
ch = scc_probe_stream_consume(stream);
core_pos_next(pos);
ch = scc_probe_stream_next(stream);
scc_pos_next(pos);
if (ch == '0') {
// numeric escape sequence
// \nnn: arbitrary octal value, code unit nnn
@@ -237,15 +237,15 @@ int lex_parse_char(scc_probe_stream_t *input, scc_pos_t *pos) {
} else {
ret = ch;
}
if ((ch = scc_probe_stream_consume(stream)) != '\'') {
if ((ch = scc_probe_stream_next(stream)) != '\'') {
LOG_ERROR("Unclosed character literal '%c' at end, expect `'`", ch);
core_pos_next(pos);
scc_pos_next(pos);
goto ERR;
}
return ret;
ERR:
return core_stream_eof;
return scc_stream_eof;
}
/**
@@ -257,38 +257,38 @@ ERR:
* @return cbool
* https://cppreference.cn/w/c/language/string_literal
*/
cbool lex_parse_string(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output) {
cbool scc_lex_parse_string(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null && output != null);
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
Assert(scc_cstring_is_empty(output));
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unexpected EOF at begin");
goto ERR;
} else if (ch != '"') {
LOG_WARN("Unexpected character '%c' at begin", ch);
goto ERR;
}
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
scc_cstring_t str = scc_cstring_from_cstr("");
while (1) {
ch = scc_probe_stream_peek(stream);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_ERROR("Unexpected EOF at string literal");
goto ERR;
} else if (lex_parse_is_endline(ch)) {
} else if (scc_lex_parse_is_endline(ch)) {
LOG_ERROR("Unexpected newline at string literal");
goto ERR;
} else if (ch == '\\') {
// TODO bad practice and maybe bugs here
scc_probe_stream_consume(stream);
ch = scc_probe_stream_consume(stream);
scc_probe_stream_next(stream);
ch = scc_probe_stream_next(stream);
int val = got_simple_escape(ch);
if (val == -1) {
LOG_ERROR("Invalid escape character it is \\%c [%d]", ch, ch);
@@ -297,13 +297,13 @@ cbool lex_parse_string(scc_probe_stream_t *input, scc_pos_t *pos,
continue;
}
} else if (ch == '"') {
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
break;
}
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
scc_cstring_append_ch(&str, ch);
}
@@ -323,36 +323,36 @@ ERR:
* @return cbool
* https://cppreference.cn/w/c/language/integer_constant
*/
cbool lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
usize *output) {
cbool scc_lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
usize *output) {
scc_probe_stream_t *stream = input;
Assert(stream != null && pos != null && output != null);
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
int base = 10; // decimal by default
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unexpected EOF at begin");
goto ERR;
}
if (ch == '0') {
// consume '0'
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
// peek at the next character
ch = scc_probe_stream_peek(stream);
if (ch == 'x' || ch == 'X') {
// hexadecimal
base = 16;
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
} else if (ch == 'b' || ch == 'B') {
// binary (added in C23)
base = 2;
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
} else if (ch >= '0' && ch <= '7') {
// octal
base = 8;
@@ -374,7 +374,7 @@ cbool lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
}
// parse the integer part
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
usize n;
if (_lex_parse_uint(stream, pos, base, &n) == false) {
// if no digit matched but the input was '0', it has already been handled
@@ -383,8 +383,8 @@ cbool lex_parse_number(scc_probe_stream_t *input, scc_pos_t *pos,
// single-digit case, e.g. "1"
// we need to consume this digit and return its value
if (ch >= '1' && ch <= '9') {
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
*output = ch - '0';
return true;
}
@@ -406,22 +406,21 @@ ERR:
* @return cbool
* https://cppreference.cn/w/c/language/identifier
*/
cbool lex_parse_identifier(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output) {
cbool scc_lex_parse_identifier(scc_probe_stream_t *input, scc_pos_t *pos,
scc_cstring_t *output) {
Assert(input != null && pos != null && output != null);
Assert(scc_cstring_is_empty(output));
scc_probe_stream_t *stream = input;
scc_probe_stream_reset(stream);
// scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LOG_WARN("Unexpected EOF at begin");
} else if (ch == '_' || (ch >= 'a' && ch <= 'z') ||
(ch >= 'A' && ch <= 'Z')) {
} else if (scc_lex_parse_is_identifier_prefix(ch)) {
while (1) {
scc_cstring_append_ch(output, ch);
scc_probe_stream_consume(stream);
core_pos_next(pos);
scc_probe_stream_next(stream);
scc_pos_next(pos);
ch = scc_probe_stream_peek(stream);
if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
(ch == '_') || (ch >= '0' && ch <= '9')) {
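The skip helpers in this file share one pattern: peek at the next character, decide what it means, then advance the stream. The following is a rough sketch of that pattern for block comments. It assumes a hypothetical in-memory stream (`mini_stream_t`, `mini_peek`, `mini_next`) rather than the real `scc_probe_stream_t` API, and it counts newlines instead of updating an `scc_pos_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MINI_EOF (-1)

/* Hypothetical in-memory stream: a buffer plus a read cursor. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
} mini_stream_t;

/* Look at the next character without consuming it. */
static int mini_peek(const mini_stream_t *s) {
    return s->pos < s->len ? (unsigned char)s->data[s->pos] : MINI_EOF;
}

/* Consume and return the next character. */
static int mini_next(mini_stream_t *s) {
    return s->pos < s->len ? (unsigned char)s->data[s->pos++] : MINI_EOF;
}

/*
 * Skip a block comment that starts at the cursor.
 * Returns true if the closing marker was found, false if the input does
 * not begin with the opening marker or ends before the comment closes.
 * *lines is incremented once per newline crossed.
 */
static bool skip_block_comment(mini_stream_t *s, size_t *lines) {
    assert(s != NULL && lines != NULL);
    if (mini_next(s) != '/' || mini_next(s) != '*') return false;
    for (;;) {
        int ch = mini_next(s);
        if (ch == MINI_EOF) return false; /* unterminated comment */
        if (ch == '\n') ++*lines;
        if (ch == '*' && mini_peek(s) == '/') {
            mini_next(s); /* consume the closing slash */
            return true;
        }
    }
}
```

Peeking before consuming the closing slash matters for input such as `**/`: the first star is consumed, the second is then examined on its own, and only that one is followed by the slash.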


@@ -4,12 +4,16 @@
cbool check_char(const char *str, int expect, int *output) {
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
*output = lex_parse_char(stream, &pos);
return *output == expect;
*output = scc_lex_parse_char(stream, &pos);
cbool ret1 = *output == expect;
scc_probe_stream_reset(stream);
*output = scc_lex_parse_char(stream, &pos);
cbool ret2 = *output == expect;
return ret1 && ret2;
}
#define CHECK_CHAR_VALID(str, expect) \
@@ -22,8 +26,8 @@ cbool check_char(const char *str, int expect, int *output) {
#define CHECK_CHAR_INVALID(str) \
do { \
int _output; \
check_char(str, core_stream_eof, &_output); \
TEST_CHECK(_output == core_stream_eof); \
check_char(str, scc_stream_eof, &_output); \
TEST_CHECK(_output == scc_stream_eof); \
} while (0)
void test_simple_char(void) {


@@ -5,12 +5,12 @@
cbool check_identifier(const char *str, const char *expect,
scc_cstring_t *output) {
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
cbool ret = lex_parse_identifier(stream, &pos, output);
cbool ret = scc_lex_parse_identifier(stream, &pos, output);
if (ret && expect) {
return strcmp(output->data, expect) == 0;
}
@@ -19,7 +19,7 @@ cbool check_identifier(const char *str, const char *expect,
#define CHECK_IDENTIFIER_VALID(str, expect) \
do { \
scc_cstring_t _output = scc_cstring_new(); \
scc_cstring_t _output = scc_cstring_create(); \
cbool ret = check_identifier(str, expect, &_output); \
TEST_CHECK(ret == true); \
TEST_CHECK(strcmp(_output.data, expect) == 0); \
@@ -28,7 +28,7 @@ cbool check_identifier(const char *str, const char *expect,
#define CHECK_IDENTIFIER_INVALID(str) \
do { \
scc_cstring_t _output = scc_cstring_new(); \
scc_cstring_t _output = scc_cstring_create(); \
cbool ret = check_identifier(str, NULL, &_output); \
TEST_CHECK(ret == false); \
scc_cstring_free(&_output); \


@@ -5,11 +5,11 @@ cbool check(const char *str, usize expect, usize *output) {
// TODO maybe have other logger
(void)(expect);
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
return lex_parse_number(stream, &pos, output);
return scc_lex_parse_number(stream, &pos, output);
}
#define CHECK_VALID(str, expect) \


@@ -4,18 +4,19 @@
void check_skip_block_comment(const char *str, const char *expect_remaining) {
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
lex_parse_skip_block_comment(stream, &pos);
scc_lex_parse_skip_block_comment(stream, &pos);
scc_probe_stream_sync(stream);
// Check remaining content
char buffer[256] = {0};
int i = 0;
int ch;
while ((ch = scc_probe_stream_consume(stream)) != core_stream_eof &&
while ((ch = scc_probe_stream_consume(stream)) != scc_stream_eof &&
i < 255) {
buffer[i++] = (char)ch;
}


@@ -4,18 +4,19 @@
void check_skip_line(const char *str, const char *expect_remaining) {
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
lex_parse_skip_line(stream, &pos);
scc_lex_parse_skip_line(stream, &pos);
scc_probe_stream_sync(stream);
// Check remaining content
char buffer[256] = {0};
int i = 0;
int ch;
while ((ch = scc_probe_stream_consume(stream)) != core_stream_eof &&
while ((ch = scc_probe_stream_consume(stream)) != scc_stream_eof &&
i < 255) {
buffer[i++] = (char)ch;
}


@@ -4,12 +4,12 @@
cbool check_string(const char *str, const char *expect, scc_cstring_t *output) {
log_set_level(&__default_logger_root, 0);
scc_pos_t pos = scc_pos_init();
scc_pos_t pos = scc_pos_create();
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(&mem_stream, str, scc_strlen(str), false);
cbool ret = lex_parse_string(stream, &pos, output);
cbool ret = scc_lex_parse_string(stream, &pos, output);
if (ret && expect) {
return strcmp(output->data, expect) == 0;
}
@@ -18,7 +18,7 @@ cbool check_string(const char *str, const char *expect, scc_cstring_t *output) {
#define CHECK_STRING_VALID(str, expect) \
do { \
scc_cstring_t _output = scc_cstring_new(); \
scc_cstring_t _output = scc_cstring_create(); \
cbool ret = check_string(str, expect, &_output); \
TEST_CHECK(ret == true); \
TEST_CHECK(strcmp(_output.data, expect) == 0); \
@@ -27,7 +27,7 @@ cbool check_string(const char *str, const char *expect, scc_cstring_t *output) {
#define CHECK_STRING_INVALID(str) \
do { \
scc_cstring_t _output = scc_cstring_new(); \
scc_cstring_t _output = scc_cstring_create(); \
cbool ret = check_string(str, NULL, &_output); \
TEST_CHECK(ret == false); \
scc_cstring_free(&_output); \


@@ -1,8 +1,8 @@
[package]
name = "smcc_lex"
name = "scc_lex"
version = "0.1.0"
dependencies = [
{ name = "libcore", path = "../../runtime/libcore" },
{ name = "smcc_lex_parser", path = "../lex_parser" },
{ name = "scc_core", path = "../../runtime/scc_core" },
{ name = "lex_parser", path = "../lex_parser" },
]


@@ -7,20 +7,14 @@
#define __SCC_LEXER_H__
#include "lexer_token.h"
#include <libcore.h>
typedef struct lexer_token {
scc_tok_type_t type;
scc_cvalue_t value;
scc_pos_t loc;
} lexer_tok_t;
#include <scc_core.h>
/**
* @brief Core lexer structure
*
* Encapsulates the state and buffer management needed for lexical analysis
*/
typedef struct cc_lexer {
typedef struct scc_lexer {
scc_probe_stream_t *stream;
scc_pos_t pos;
} scc_lexer_t;
@@ -39,7 +33,7 @@ void scc_lexer_init(scc_lexer_t *lexer, scc_probe_stream_t *stream);
*
* This function returns tokens of every type, including invalid ones such as whitespace
*/
void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token);
void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token);
/**
* @brief Get a valid token
@@ -48,6 +42,63 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token);
*
* This function automatically skips whitespace and other invalid tokens and returns a token that is meaningful to the parser
*/
void scc_lexer_get_valid_token(scc_lexer_t *lexer, lexer_tok_t *token);
void scc_lexer_get_valid_token(scc_lexer_t *lexer, scc_lexer_tok_t *token);
typedef SCC_VEC(scc_lexer_tok_t) scc_lexer_tok_vec_t;
typedef struct scc_lexer_stream scc_lexer_stream_t;
struct scc_lexer_stream {
scc_lexer_t *lexer;
scc_lexer_tok_vec_t toks; // ring buffer
usize curr_pos; // current read position (logical)
usize probe_pos; // filled-up-to position (logical)
cbool need_comment;
/// @brief Look ahead n tokens
const scc_lexer_tok_t *(*peek)(scc_lexer_stream_t *stream, usize n);
/// @brief Advance the read pointer by offset tokens
void (*advance)(scc_lexer_stream_t *stream, usize offset);
/// @brief Destroy the stream and release its resources
void (*drop)(scc_lexer_stream_t *stream);
};
/**
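* @brief Wrap a lexer as a token stream (with its own buffer)
* @param[in] lexer an already-initialized lexer instance
* @param[out] stream pointer to the output stream object
* @param[in] need_comment whether comment tokens should be emitted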
*/
void scc_lexer_to_stream(scc_lexer_t *lexer, scc_lexer_stream_t *stream,
cbool need_comment);
static inline const scc_lexer_tok_t *
scc_lexer_stream_current(scc_lexer_stream_t *stream) {
Assert(stream != null);
return stream->peek(stream, 0);
}
static inline const scc_lexer_tok_t *
scc_lexer_stream_peek(scc_lexer_stream_t *stream, usize n) {
Assert(stream != null);
return stream->peek(stream, n);
}
static inline void scc_lexer_stream_consume(scc_lexer_stream_t *stream) {
Assert(stream != null);
return stream->advance(stream, 1);
}
static inline void scc_lexer_stream_advance(scc_lexer_stream_t *stream,
usize n) {
Assert(stream != null);
return stream->advance(stream, n);
}
static inline void scc_lexer_stream_drop(scc_lexer_stream_t *stream) {
Assert(stream != null);
return stream->drop(stream);
}
#endif /* __SCC_LEXER_H__ */


@@ -1,48 +1,48 @@
#ifndef __SMCC_LEXER_LOG_H__
#define __SMCC_LEXER_LOG_H__
#ifndef __SCC_LEXER_LOG_H__
#define __SCC_LEXER_LOG_H__
#include <libcore.h>
#include <scc_core.h>
#ifndef LEX_LOG_LEVEL
#define LEX_LOG_LEVEL 4
#endif
#if LEX_LOG_LEVEL <= 1
#define LEX_NOTSET(fmt, ...) MLOG_NOTSET(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_NOTSET(fmt, ...) MLOG_NOTSET(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_NOTSET(fmt, ...)
#endif
#if LEX_LOG_LEVEL <= 2
#define LEX_DEBUG(fmt, ...) MLOG_DEBUG(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_DEBUG(fmt, ...) MLOG_DEBUG(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_DEBUG(fmt, ...)
#endif
#if LEX_LOG_LEVEL <= 3
#define LEX_INFO(fmt, ...) MLOG_INFO(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_INFO(fmt, ...) MLOG_INFO(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_INFO(fmt, ...)
#endif
#if LEX_LOG_LEVEL <= 4
#define LEX_WARN(fmt, ...) MLOG_WARN(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_WARN(fmt, ...) MLOG_WARN(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_WARN(fmt, ...)
#endif
#if LEX_LOG_LEVEL <= 5
#define LEX_ERROR(fmt, ...) MLOG_ERROR(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_ERROR(fmt, ...) MLOG_ERROR(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_ERROR(fmt, ...)
#endif
#if LEX_LOG_LEVEL <= 6
#define LEX_FATAL(fmt, ...) MLOG_FATAL(&__smcc_lexer_log, fmt, ##__VA_ARGS__)
#define LEX_FATAL(fmt, ...) MLOG_FATAL(&__scc_lexer_log, fmt, ##__VA_ARGS__)
#else
#define LEX_FATAL(fmt, ...)
#endif
extern logger_t __smcc_lexer_log;
extern logger_t __scc_lexer_log;
#endif // __SMCC_LEXER_LOG_H__
#endif /* __SCC_LEXER_LOG_H__ */
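The header above filters log calls at compile time: each `LEX_*` macro expands to a real call only when `LEX_LOG_LEVEL` is at or below that macro's level, and to nothing otherwise. A small sketch of the same idea follows. Every name in it (`DEMO_LOG_LEVEL`, `DEMO_EMIT`, `DEMO_DEBUG`, `DEMO_WARN`, `demo_log_count`, `demo_run`) is hypothetical, and `fprintf` stands in for the project's `MLOG_*` back end.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical threshold: macros whose level is below it compile away. */
#ifndef DEMO_LOG_LEVEL
#define DEMO_LOG_LEVEL 4
#endif

static int demo_log_count = 0; /* number of messages actually emitted */

/* Shared back end: count the message and print it to stderr. */
#define DEMO_EMIT(tag, msg)                        \
    do {                                           \
        ++demo_log_count;                          \
        fprintf(stderr, "[%s] %s\n", tag, msg);    \
    } while (0)

#if DEMO_LOG_LEVEL <= 2
#define DEMO_DEBUG(msg) DEMO_EMIT("DEBUG", msg)
#else
#define DEMO_DEBUG(msg) ((void)0)
#endif

#if DEMO_LOG_LEVEL <= 4
#define DEMO_WARN(msg) DEMO_EMIT("WARN", msg)
#else
#define DEMO_WARN(msg) ((void)0)
#endif

/* Issues one DEBUG and one WARN request; returns how many got through. */
static int demo_run(void) {
    assert(demo_log_count >= 0);
    int before = demo_log_count;
    DEMO_DEBUG("filtered out at the default level");
    DEMO_WARN("kept at the default level");
    return demo_log_count - before;
}
```

The sketch expands disabled macros to `((void)0)` rather than to nothing, which keeps a call such as `if (x) DEMO_DEBUG("...");` a complete statement. That is a design choice of the sketch, not something the header above does.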


@@ -1,7 +1,7 @@
#ifndef __SMCC_CC_TOKEN_H__
#define __SMCC_CC_TOKEN_H__
#ifndef __SCC_LEXER_TOKEN_H__
#define __SCC_LEXER_TOKEN_H__
#include <libcore.h>
#include <scc_core.h>
typedef enum scc_cstd {
SCC_CSTD_C89,
@@ -137,4 +137,24 @@ typedef enum scc_tok_subtype {
scc_tok_subtype_t scc_get_tok_subtype(scc_tok_type_t type);
const char *scc_get_tok_name(scc_tok_type_t type);
#endif
typedef struct scc_lexer_token {
scc_tok_type_t type;
scc_cvalue_t value;
scc_pos_t loc;
} scc_lexer_tok_t;
static inline cbool scc_lexer_tok_match(const scc_lexer_tok_t *tok,
scc_tok_type_t type) {
return tok->type == type;
}
static inline cbool scc_lexer_tok_expect(const scc_lexer_tok_t *tok,
scc_tok_type_t type) {
if (!scc_lexer_tok_match(tok, type)) {
LOG_ERROR("expected token %d, got %d\n", type, tok->type);
return false;
}
return true;
}
#endif /* __SCC_LEXER_TOKEN_H__ */
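The `scc_lexer_tok_match` and `scc_lexer_tok_expect` helpers added above differ only in that the latter reports a mismatch. A hedged sketch of how a parser might use such a pair is shown here. The token type, the enum values, and `demo_parse_assign` are all hypothetical, and `fprintf` replaces `LOG_ERROR`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token kinds and token type for illustration only. */
typedef enum {
    DEMO_IDENT,
    DEMO_ASSIGN,
    DEMO_INT,
    DEMO_SEMI,
    DEMO_EOF
} demo_type_t;

typedef struct {
    demo_type_t type;
} demo_tok_t;

/* Silent check: does the token have the given type? */
static bool demo_tok_match(const demo_tok_t *tok, demo_type_t type) {
    return tok->type == type;
}

/* Same check, but a mismatch is reported before returning false. */
static bool demo_tok_expect(const demo_tok_t *tok, demo_type_t type) {
    if (!demo_tok_match(tok, type)) {
        fprintf(stderr, "expected token %d, got %d\n", (int)type,
                (int)tok->type);
        return false;
    }
    return true;
}

/* Recognise `IDENT = INT ;` at the start of a DEMO_EOF-terminated array. */
static bool demo_parse_assign(const demo_tok_t *toks) {
    assert(toks != NULL);
    static const demo_type_t want[] = {DEMO_IDENT, DEMO_ASSIGN, DEMO_INT,
                                       DEMO_SEMI};
    for (size_t i = 0; i < sizeof want / sizeof want[0]; ++i) {
        /* DEMO_EOF never matches, so the loop stops at the terminator. */
        if (!demo_tok_expect(&toks[i], want[i])) return false;
    }
    return true;
}
```

Using `match` for optional tokens and `expect` for mandatory ones keeps error reporting in one place.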


@@ -1,31 +1,3 @@
/**
* Modeled on the lexical analysis part of LCCompiler
*
* Below is the LCC README as of 2025.2
This hierarchy is the distribution for lcc version 4.2.
lcc version 3.x is described in the book "A Retargetable C Compiler:
Design and Implementation" (Addison-Wesley, 1995, ISBN 0-8053-1670-1).
There are significant differences between 3.x and 4.x, most notably in
the intermediate code. For details, see
https://drh.github.io/lcc/documents/interface4.pdf.
VERSION 4.2 IS INCOMPATIBLE WITH EARLIER VERSIONS OF LCC. DO NOT
UNLOAD THIS DISTRIBUTION ON TOP OF A 3.X DISTRIBUTION.
LCC is a C89 ("ANSI C") compiler designed to be highly retargetable.
LOG describes the changes since the last release.
CPYRIGHT describes the conditions under you can use, copy, modify, and
distribute lcc or works derived from lcc.
doc/install.html is an HTML file that gives a complete description of
the distribution and installation instructions.
Chris Fraser / cwf@aya.yale.edu
David Hanson / drh@drhanson.net
*/
#include <lex_parser.h>
#include <lexer.h>
#include <lexer_log.h>
@@ -77,23 +49,23 @@ static inline int keyword_cmp(const char *name, int len) {
void scc_lexer_init(scc_lexer_t *lexer, scc_probe_stream_t *stream) {
lexer->stream = stream;
lexer->pos = scc_pos_init();
lexer->pos = scc_pos_create();
// FIXME
lexer->pos.name = scc_cstring_from_cstr(scc_cstring_as_cstr(&stream->name));
lexer->pos.name = scc_cstring_copy(&stream->name);
}
#define set_err_token(token) ((token)->type = SCC_TOK_UNKNOWN)
static void parse_line(scc_lexer_t *lexer, lexer_tok_t *token) {
static void parse_line(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
token->loc = lexer->pos;
scc_probe_stream_t *stream = lexer->stream;
scc_probe_stream_reset(stream);
int ch = scc_probe_stream_next(stream);
usize n;
scc_cstring_t str = scc_cstring_new();
scc_cstring_t str = scc_cstring_create();
if (ch == core_stream_eof) {
if (ch == scc_stream_eof) {
LEX_WARN("Unexpected EOF at begin");
goto ERR;
} else if (ch != '#') {
@@ -105,7 +77,7 @@ static void parse_line(scc_lexer_t *lexer, lexer_tok_t *token) {
for (int i = 0; i < (int)sizeof(line); i++) {
ch = scc_probe_stream_consume(stream);
core_pos_next(&lexer->pos);
scc_pos_next(&lexer->pos);
if (ch != line[i]) {
LEX_WARN("Macros are not supported in the lexer but in the preprocessor, "
"it will be ignored");
@@ -113,13 +85,13 @@ static void parse_line(scc_lexer_t *lexer, lexer_tok_t *token) {
}
}
if (lex_parse_number(lexer->stream, &lexer->pos, &n) == false) {
if (scc_lex_parse_number(stream, &lexer->pos, &n) == false) {
LEX_ERROR("Invalid line number");
goto SKIP_LINE;
}
if (scc_probe_stream_consume(stream) != ' ') {
lex_parse_skip_line(lexer->stream, &lexer->pos);
scc_lex_parse_skip_line(stream, &lexer->pos);
token->loc.line = token->value.n;
}
@@ -127,26 +99,28 @@ static void parse_line(scc_lexer_t *lexer, lexer_tok_t *token) {
LEX_ERROR("Invalid `#` line");
goto SKIP_LINE;
}
if (lex_parse_string(lexer->stream, &lexer->pos, &str) == false) {
if (scc_lex_parse_string(stream, &lexer->pos, &str) == false) {
LEX_ERROR("Invalid filename");
goto SKIP_LINE;
}
lex_parse_skip_line(lexer->stream, &lexer->pos);
scc_lex_parse_skip_line(stream, &lexer->pos);
scc_probe_stream_sync(stream);
token->loc.line = n;
// FIXME memory leak
token->loc.name = scc_cstring_from_cstr(scc_cstring_as_cstr(&str));
token->loc.name = scc_cstring_copy(&str);
scc_cstring_free(&str);
return;
SKIP_LINE:
lex_parse_skip_line(lexer->stream, &lexer->pos);
scc_lex_parse_skip_line(stream, &lexer->pos);
scc_probe_stream_sync(stream);
ERR:
set_err_token(token);
scc_cstring_free(&str);
}
// /zh/c/language/operator_arithmetic.html
void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
void scc_lexer_get_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
token->loc = lexer->pos;
token->type = SCC_TOK_UNKNOWN;
scc_probe_stream_t *stream = lexer->stream;
@@ -212,11 +186,15 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
type = SCC_TOK_ASSIGN_DIV;
goto double_char;
case '/':
lex_parse_skip_line(lexer->stream, &lexer->pos);
scc_probe_stream_reset(stream);
scc_lex_parse_skip_line(stream, &lexer->pos);
scc_probe_stream_sync(stream);
token->type = SCC_TOK_LINE_COMMENT;
goto END;
case '*':
lex_parse_skip_block_comment(lexer->stream, &lexer->pos);
scc_probe_stream_reset(stream);
scc_lex_parse_skip_block_comment(stream, &lexer->pos);
scc_probe_stream_sync(stream);
token->type = SCC_TOK_BLOCK_COMMENT;
goto END;
default:
@@ -323,33 +301,17 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
break;
}
break;
case '[':
type = SCC_TOK_L_BRACKET;
break;
case ']':
type = SCC_TOK_R_BRACKET;
break;
case '(':
type = SCC_TOK_L_PAREN;
break;
case ')':
type = SCC_TOK_R_PAREN;
break;
case '{':
type = SCC_TOK_L_BRACE;
break;
case '}':
type = SCC_TOK_R_BRACE;
break;
case ';':
type = SCC_TOK_SEMICOLON;
break;
case ',':
type = SCC_TOK_COMMA;
break;
case ':':
type = SCC_TOK_COLON;
break;
/* clang-format off */
case '[': type = SCC_TOK_L_BRACKET; break;
case ']': type = SCC_TOK_R_BRACKET; break;
case '(': type = SCC_TOK_L_PAREN; break;
case ')': type = SCC_TOK_R_PAREN; break;
case '{': type = SCC_TOK_L_BRACE; break;
case '}': type = SCC_TOK_R_BRACE; break;
case ';': type = SCC_TOK_SEMICOLON; break;
case ',': type = SCC_TOK_COMMA; break;
case ':': type = SCC_TOK_COLON; break;
/* clang-format on */
case '.':
if (scc_probe_stream_next(stream) == '.' &&
scc_probe_stream_next(stream) == '.') {
@@ -369,7 +331,8 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
break;
case '\r':
case '\n':
lex_parse_skip_endline(lexer->stream, &lexer->pos);
scc_lex_parse_skip_endline(stream, &lexer->pos);
scc_probe_stream_sync(stream);
token->type = SCC_TOK_BLANK;
goto END;
case '#':
@@ -377,15 +340,17 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
token->type = SCC_TOK_BLANK;
goto END;
case '\0':
case core_stream_eof:
case scc_stream_eof:
// EOF
type = SCC_TOK_EOF;
break;
case '\'': {
token->loc = lexer->pos;
token->type = SCC_TOK_CHAR_LITERAL;
int ch = lex_parse_char(lexer->stream, &lexer->pos);
if (ch == core_stream_eof) {
scc_probe_stream_reset(stream);
int ch = scc_lex_parse_char(stream, &lexer->pos);
scc_probe_stream_sync(stream);
if (ch == scc_stream_eof) {
LEX_ERROR("Unexpected character literal");
token->type = SCC_TOK_UNKNOWN;
} else {
@@ -396,8 +361,10 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
case '"': {
token->loc = lexer->pos;
token->type = SCC_TOK_STRING_LITERAL;
scc_cstring_t output = scc_cstring_new();
if (lex_parse_string(lexer->stream, &lexer->pos, &output) == true) {
scc_cstring_t output = scc_cstring_create();
scc_probe_stream_reset(stream);
if (scc_lex_parse_string(stream, &lexer->pos, &output) == true) {
scc_probe_stream_sync(stream);
token->value.cstr.data = scc_cstring_as_cstr(&output);
token->value.cstr.len = scc_cstring_len(&output);
} else {
@@ -414,7 +381,9 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
token->loc = lexer->pos;
token->type = SCC_TOK_INT_LITERAL;
usize output;
if (lex_parse_number(lexer->stream, &lexer->pos, &output) == true) {
scc_probe_stream_reset(stream);
if (scc_lex_parse_number(stream, &lexer->pos, &output) == true) {
scc_probe_stream_sync(stream);
token->value.n = output;
} else {
LEX_ERROR("Unexpected number literal");
@@ -431,8 +400,10 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U':
case 'V': case 'W': case 'X': case 'Y': case 'Z': case '_':
/* clang-format on */
scc_cstring_t str = scc_cstring_new();
cbool ret = lex_parse_identifier(lexer->stream, &lexer->pos, &str);
scc_cstring_t str = scc_cstring_create();
scc_probe_stream_reset(stream);
cbool ret = scc_lex_parse_identifier(stream, &lexer->pos, &str);
scc_probe_stream_sync(stream);
Assert(ret == true);
int res = keyword_cmp(scc_cstring_as_cstr(&str), scc_cstring_len(&str));
@@ -453,13 +424,13 @@ void scc_lexer_get_token(scc_lexer_t *lexer, lexer_tok_t *token) {
goto once_char;
triple_char:
scc_probe_stream_consume(stream);
core_pos_next(&lexer->pos);
scc_pos_next(&lexer->pos);
double_char:
scc_probe_stream_consume(stream);
core_pos_next(&lexer->pos);
scc_pos_next(&lexer->pos);
once_char:
scc_probe_stream_consume(stream);
core_pos_next(&lexer->pos);
scc_pos_next(&lexer->pos);
token->type = type;
END:
LEX_DEBUG("get token `%s` in %s:%d:%d", scc_get_tok_name(token->type),
@@ -467,7 +438,7 @@ END:
}
// scc_lexer_get_token may return tokens that are invalid for the parser
void scc_lexer_get_valid_token(scc_lexer_t *lexer, lexer_tok_t *token) {
void scc_lexer_get_valid_token(scc_lexer_t *lexer, scc_lexer_tok_t *token) {
scc_tok_subtype_t type;
do {
scc_lexer_get_token(lexer, token);


@@ -1,6 +1,6 @@
#include <lexer_log.h>
logger_t __smcc_lexer_log = {
logger_t __scc_lexer_log = {
.name = "lexer",
.level = LOG_LEVEL_ALL,
.handler = log_default_handler,


@@ -0,0 +1,138 @@
#include <lexer.h>
static void lexer_stream_extend(scc_lexer_stream_t *stream, usize n) {
Assert(stream != null);
// check whether the buffer needs to grow
if ((stream->probe_pos - stream->curr_pos + n) >= stream->toks.cap) {
// it does: allocate a new buffer
usize new_cap = stream->toks.cap * 2;
if (new_cap < stream->probe_pos - stream->curr_pos + n + 1) {
new_cap = stream->probe_pos - stream->curr_pos + n + 1;
}
scc_lexer_tok_t *new_data =
scc_realloc(null, new_cap * sizeof(scc_lexer_tok_t));
if (!new_data) {
LOG_FATAL("lexer_stream_extend: realloc failed\n");
}
// copy the data from the old buffer into the new one, preserving order
usize data_count = stream->probe_pos - stream->curr_pos;
for (usize i = 0; i < data_count; ++i) {
usize old_idx = (stream->curr_pos + i) % stream->toks.cap;
new_data[i] = stream->toks.data[old_idx];
}
// free the old buffer
if (stream->toks.data) {
scc_free(stream->toks.data);
}
// update the struct
stream->toks.data = new_data;
stream->toks.cap = new_cap;
stream->curr_pos = 0;
stream->probe_pos = data_count;
}
// fill in the new tokens
for (usize i = 0; i < n; ++i) {
usize idx = (stream->probe_pos + i) % stream->toks.cap;
if (stream->need_comment)
scc_lexer_get_token(stream->lexer, &stream->toks.data[idx]);
else
scc_lexer_get_valid_token(stream->lexer, &stream->toks.data[idx]);
}
stream->probe_pos += n;
}
static const scc_lexer_tok_t *lexer_stream_peek(scc_lexer_stream_t *stream,
usize n) {
Assert(stream != null);
// compute how many lookahead tokens are needed
usize available = stream->probe_pos - stream->curr_pos;
if (n >= available) {
// the buffer needs to be extended
usize need = n - available + 1;
lexer_stream_extend(stream, need);
}
// compute the actual index in the buffer
usize idx = (stream->curr_pos + n) % stream->toks.cap;
return &stream->toks.data[idx];
}
static void lexer_stream_advance(scc_lexer_stream_t *stream, usize offset) {
Assert(stream != null);
if (stream->curr_pos + offset > stream->probe_pos) {
// try to fill in more tokens
usize need = stream->curr_pos + offset - stream->probe_pos;
lexer_stream_extend(stream, need);
}
stream->curr_pos += offset;
// optional: compact the buffer once too many tokens have been consumed
if (stream->curr_pos > stream->toks.cap * 3 / 4) {
// compact the buffer: move the live data to the front
usize data_count = stream->probe_pos - stream->curr_pos;
scc_lexer_tok_t *temp =
scc_realloc(null, data_count * sizeof(scc_lexer_tok_t));
if (!temp)
return; // it does not matter if compaction fails
for (usize i = 0; i < data_count; ++i) {
usize old_idx = (stream->curr_pos + i) % stream->toks.cap;
temp[i] = stream->toks.data[old_idx];
}
scc_free(stream->toks.data);
stream->toks.data = temp;
stream->toks.cap = data_count;
stream->curr_pos = 0;
stream->probe_pos = data_count;
}
}
static void lexer_stream_drop(scc_lexer_stream_t *stream) {
Assert(stream != null);
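// clean up every token, in case any holds internal resources that need freeing
for (usize i = 0; i < stream->toks.cap; ++i) {
// this assumes scc_lexer_tok_t may own resources that need to be released
// if so, the corresponding cleanup function must be called here
// e.g.: if (stream->toks.data[i].needs_free)
// scc_free(stream->toks.data[i].ptr);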
}
scc_vec_free(stream->toks);
stream->lexer = null;
stream->curr_pos = 0;
stream->probe_pos = 0;
stream->need_comment = false;
stream->peek = null;
stream->advance = null;
stream->drop = null;
}
void scc_lexer_to_stream(scc_lexer_t *lexer, scc_lexer_stream_t *stream,
cbool need_comment) {
Assert(lexer != null && stream != null);
stream->lexer = lexer;
stream->curr_pos = 0;
stream->probe_pos = 0;
stream->need_comment = need_comment;
// initialize the ring buffer
scc_vec_init(stream->toks);
scc_vec_realloc(stream->toks, 8); // initial capacity is 8
stream->peek = lexer_stream_peek;
stream->advance = lexer_stream_advance;
stream->drop = lexer_stream_drop;
}
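The new file above buffers tokens in a ring so that a parser can peek several tokens ahead before consuming any. The following is a simplified sketch of that technique, not a copy of the implementation: all names (`demo_source_t`, `demo_lookahead_t`, `demo_fill`, `demo_peek`, `demo_advance`, `demo_drop`) are hypothetical, integers stand in for tokens, `malloc`/`free` stand in for `scc_realloc`/`scc_free`, and the buffer tracks a head index plus a count instead of two logical positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical token source: yields 0, 1, 2, ... in order. */
typedef struct {
    int next_value;
} demo_source_t;

typedef struct {
    demo_source_t *src;
    int *buf;     /* ring buffer storage */
    size_t cap;   /* number of slots in buf (never 0 while in use) */
    size_t head;  /* physical index of the current token */
    size_t count; /* tokens buffered but not yet consumed */
} demo_lookahead_t;

/* Returns 1 on success, 0 if the initial allocation fails. */
static int demo_lookahead_init(demo_lookahead_t *la, demo_source_t *src) {
    assert(la != NULL && src != NULL);
    la->src = src;
    la->cap = 4;
    la->head = 0;
    la->count = 0;
    la->buf = malloc(la->cap * sizeof *la->buf);
    return la->buf != NULL;
}

/* Ensure at least `want` tokens are buffered; returns 0 on failure. */
static int demo_fill(demo_lookahead_t *la, size_t want) {
    if (want > la->cap) {
        size_t new_cap = la->cap;
        while (new_cap < want) new_cap *= 2;
        int *grown = malloc(new_cap * sizeof *grown);
        if (grown == NULL) return 0;
        /* Unwrap the ring so the current token lands at index 0. */
        for (size_t i = 0; i < la->count; ++i)
            grown[i] = la->buf[(la->head + i) % la->cap];
        free(la->buf);
        la->buf = grown;
        la->cap = new_cap;
        la->head = 0;
    }
    while (la->count < want) {
        la->buf[(la->head + la->count) % la->cap] = la->src->next_value++;
        ++la->count;
    }
    return 1;
}

/* Return the token n positions ahead of the current one (0 = current). */
static int demo_peek(demo_lookahead_t *la, size_t n) {
    int ok = demo_fill(la, n + 1);
    assert(ok);
    (void)ok;
    return la->buf[(la->head + n) % la->cap];
}

/* Consume n tokens, pulling from the source if fewer are buffered. */
static void demo_advance(demo_lookahead_t *la, size_t n) {
    int ok = demo_fill(la, n);
    assert(ok);
    (void)ok;
    la->head = (la->head + n) % la->cap;
    la->count -= n;
}

/* Release the buffer and reset the bookkeeping fields. */
static void demo_drop(demo_lookahead_t *la) {
    free(la->buf);
    la->buf = NULL;
    la->cap = la->count = la->head = 0;
}
```

The sketch never shrinks its buffer, so the modulo divisor `cap` stays non-zero for the lifetime of the stream. It trades some memory for simpler index arithmetic.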


@@ -7,7 +7,7 @@
static inline void test_lexer_string(const char *input,
scc_tok_type_t expected_type) {
scc_lexer_t lexer;
lexer_tok_t token;
scc_lexer_tok_t token;
scc_mem_probe_stream_t stream;
scc_lexer_init(&lexer, scc_mem_probe_stream_init(&stream, input,


@@ -27,8 +27,9 @@ int main(int argc, char *argv[]) {
log_set_level(NULL, LOG_LEVEL_ALL);
} else {
// FIXME it is a hack lexer_logger
log_set_level(&__smcc_lexer_log, LOG_LEVEL_NOTSET);
log_set_level(NULL, LOG_LEVEL_INFO | LOG_LEVEL_WARN | LOG_LEVEL_ERROR);
log_set_level(&__scc_lexer_log, LOG_LEVEL_NOTSET);
log_set_level(NULL, LOG_LEVEL_INFO | LOG_LEVEL_WARN | LOG_LEVEL_ERROR |
LOG_LEVEL_FATAL);
}
const char *file_name = __FILE__;
@@ -70,11 +71,11 @@ int main(int argc, char *argv[]) {
scc_cstring_clear(&stream->name);
scc_cstring_append_cstr(&stream->name, file_name, strlen(file_name));
scc_lexer_init(&lexer, stream);
lexer_tok_t tok;
scc_lexer_tok_t tok;
while (1) {
scc_lexer_get_valid_token(&lexer, &tok);
if (tok.type == SCC_TOK_EOF) {
if (tok.type == SCC_TOK_EOF) {
break;
}
LOG_DEBUG("token `%s` at %s:%u:%u", scc_get_tok_name(tok.type),


@@ -1,8 +1,8 @@
[package]
name = "smcc_pprocesser"
name = "scc_pprocesser"
dependencies = [
{ name = "libcore", path = "../../runtime/libcore" },
{ name = "libutils", path = "../../runtime/libutils" },
{ name = "smcc_lex_parser", path = "../lex_parser" },
{ name = "scc_core", path = "../../runtime/scc_core" },
{ name = "scc_utils", path = "../../runtime/scc_utils" },
{ name = "lex_parser", path = "../lex_parser" },
]


@@ -0,0 +1,96 @@
#ifndef __SCC_PP_MACRO_H__
#define __SCC_PP_MACRO_H__
#include <scc_core.h>
#include <scc_utils.h>
// Macro definition kinds
typedef enum {
SCC_PP_MACRO_OBJECT, // object-like macro
SCC_PP_MACRO_FUNCTION, // function-like macro
SCC_PP_MACRO_NONE, // not a macro
} scc_pp_macro_type_t;
typedef SCC_VEC(scc_cstring_t) scc_pp_macro_list_t;
// Macro definition structure
typedef struct scc_macro {
scc_cstring_t name; // macro name
scc_pp_macro_type_t type; // macro kind
scc_pp_macro_list_t replaces; // replacement list
scc_pp_macro_list_t params; // parameter list (function-like macros only)
} scc_pp_macro_t;
typedef struct scc_macro_table {
scc_hashtable_t table; // macro definition table
} scc_pp_macro_table_t;
/**
* @brief Create a macro object
* @param name macro name
* @param type macro kind
* @return pointer to the created macro object, or NULL on failure
*/
scc_pp_macro_t *scc_pp_macro_new(const scc_cstring_t *name,
scc_pp_macro_type_t type);
/**
* @brief Destroy a macro object
* @param macro the macro object to destroy
*/
void scc_pp_macro_drop(scc_pp_macro_t *macro);
/**
* @brief Add an object-like macro
* @param pp preprocessor instance
* @param name macro name
* @param replacement list of replacement text
* @return true on success, false on failure
*/
cbool scc_pp_add_object_macro(scc_pp_macro_table_t *pp,
const scc_cstring_t *name,
const scc_pp_macro_list_t *replacement);
/**
* @brief Add a function-like macro
* @param pp preprocessor instance
* @param name macro name
* @param params parameter list
* @param replacement list of replacement text
* @return true on success, false on failure
*/
cbool scc_pp_add_function_macro(scc_pp_macro_table_t *pp,
const scc_cstring_t *name,
const scc_pp_macro_list_t *params,
const scc_pp_macro_list_t *replacement);
/**
* @brief
*
* @param pp
* @param macro
* @return scc_pp_macro_t*
*/
scc_pp_macro_t *scc_pp_macro_table_set(scc_pp_macro_table_t *pp,
scc_pp_macro_t *macro);
/**
* @brief Look up a macro definition
* @param pp preprocessor instance
* @param name macro name
* @return pointer to the macro object if found, or NULL if not found
*/
scc_pp_macro_t *scc_pp_macro_table_get(scc_pp_macro_table_t *pp,
const scc_cstring_t *name);
/**
* @brief Remove a macro from the preprocessor
* @param pp preprocessor instance
* @param name macro name
* @return true if the macro was removed, false if it was not found
*/
cbool scc_pp_macro_table_remove(scc_pp_macro_table_t *pp,
const scc_cstring_t *name);
void scc_pp_marco_table_init(scc_pp_macro_table_t *macros);
void scc_pp_macro_table_drop(scc_pp_macro_table_t *macros);
#endif /* __SCC_PP_MACRO_H__ */


@@ -0,0 +1,18 @@
#ifndef __SCC_PP_PARSE_H__
#define __SCC_PP_PARSE_H__
#include <pp_macro.h>
#include <scc_core.h>
void scc_pp_parse_directive(scc_probe_stream_t *stream, scc_pos_t *pos,
scc_pp_macro_table_t *macros);
cbool scc_pp_parse_macro_replace_list(scc_probe_stream_t *stream,
scc_pp_macro_list_t *list);
cbool scc_pp_parse_macro_arguments(scc_probe_stream_t *stream,
scc_pp_macro_list_t *args);
// expand
cbool scc_pp_expand_macro(scc_probe_stream_t *stream,
scc_pp_macro_table_t *macros,
scc_probe_stream_t **out_stream, int depth);
#endif /* __SCC_PP_PARSE_H__ */
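
Review note: `scc_pp_parse_macro_arguments` has to split a call such as `F(a, g(b, c), d)` at top-level commas only. The sketch below illustrates that parenthesis-depth technique on a plain C string. It is a simplified stand-in rather than the function declared above: the `demo_*` names and the fixed-size buffers are assumptions, and unlike the real parser it keeps an empty trailing argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ARGS 8
#define DEMO_ARG_LEN 32

/* Split "(a, f(b, c), d)" into top-level arguments.
 * Returns the number of arguments, or -1 on malformed or oversized input. */
static int demo_split_args(const char *in,
                           char args[DEMO_MAX_ARGS][DEMO_ARG_LEN]) {
    assert(in != NULL && args != NULL);
    if (*in != '(') {
        return -1;
    }
    ++in; /* consume '(' */
    int depth = 1;
    int count = 0;
    size_t len = 0;
    while (depth > 0) {
        char ch = *in++;
        if (ch == '\0') {
            return -1; /* unbalanced parentheses */
        }
        if (ch == '(') {
            depth++;
        } else if (ch == ')') {
            if (--depth == 0) {
                break; /* closing paren of the call itself */
            }
        } else if (ch == ',' && depth == 1) {
            /* Top-level separator: finish the current argument. */
            if (count + 1 >= DEMO_MAX_ARGS) {
                return -1;
            }
            args[count++][len] = '\0';
            len = 0;
            while (*in == ' ' || *in == '\t') {
                ++in; /* skip whitespace after the separator */
            }
            continue;
        }
        if (len + 1 >= DEMO_ARG_LEN) {
            return -1;
        }
        args[count][len++] = ch; /* nested parens and commas are kept */
    }
    args[count][len] = '\0';
    return (len > 0 || count > 0) ? count + 1 : 0;
}
```

Commas inside nested parentheses stay part of the enclosing argument because the separator branch only fires when the depth is exactly 1.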


@@ -1,30 +1,30 @@
#ifndef __SMCC_PP_TOKEN_H__
#define __SMCC_PP_TOKEN_H__
#ifndef __SCC_PP_TOKEN_H__
#define __SCC_PP_TOKEN_H__
/* clang-format off */
/// https://cppreference.cn/w/c/preprocessor
#define PP_INST_TOKEN \
X(define , PP_STD, PP_TOK_DEFINE ) \
X(undef , PP_STD, PP_TOK_UNDEF ) \
X(include , PP_STD, PP_TOK_INCLUDE ) \
X(if , PP_STD, PP_TOK_IF ) \
X(ifdef , PP_STD, PP_TOK_IFDEF ) \
X(ifndef , PP_STD, PP_TOK_IFNDEF ) \
X(else , PP_STD, PP_TOK_ELSE ) \
X(elif , PP_STD, PP_TOK_ELIF ) \
X(elifdef , PP_STD, PP_TOK_ELIFDEF ) \
X(elifndef , PP_C23, PP_TOK_ELIFNDEF ) \
X(endif , PP_STD, PP_TOK_ENDIF ) \
X(line , PP_STD, PP_TOK_LINE ) \
X(embed , PP_C23, PP_TOK_EMBED ) \
X(error , PP_STD, PP_TOK_ERROR ) \
X(warning , PP_C23, PP_TOK_WARNING ) \
X(pragma , PP_STD, PP_TOK_PRAMA ) \
#define SCC_PP_INST_TOKEN \
X(define , SCC_PP_STD, SCC_PP_TOK_DEFINE ) \
X(elif , SCC_PP_STD, SCC_PP_TOK_ELIF ) \
X(elifdef , SCC_PP_STD, SCC_PP_TOK_ELIFDEF ) \
X(elifndef , SCC_PP_STD, SCC_PP_TOK_ELIFNDEF ) \
X(else , SCC_PP_STD, SCC_PP_TOK_ELSE ) \
X(embed , SCC_PP_STD, SCC_PP_TOK_EMBED ) \
X(endif , SCC_PP_STD, SCC_PP_TOK_ENDIF ) \
X(error , SCC_PP_STD, SCC_PP_TOK_ERROR ) \
X(if , SCC_PP_STD, SCC_PP_TOK_IF ) \
X(ifdef , SCC_PP_C23, SCC_PP_TOK_IFDEF ) \
X(ifndef , SCC_PP_STD, SCC_PP_TOK_IFNDEF ) \
X(include , SCC_PP_STD, SCC_PP_TOK_INCLUDE ) \
X(line , SCC_PP_C23, SCC_PP_TOK_LINE ) \
X(pragma , SCC_PP_STD, SCC_PP_TOK_PRAGMA ) \
X(undef , SCC_PP_C23, SCC_PP_TOK_UNDEF ) \
X(warning , SCC_PP_STD, SCC_PP_TOK_WARNING ) \
// END
/* clang-format on */
#define X(name, type, tok) tok,
typedef enum pp_token { PP_INST_TOKEN } pp_token_t;
typedef enum scc_pp_token { SCC_PP_INST_TOKEN } scc_pp_token_t;
#undef X
#endif /* __SMCC_PP_TOKEN_H__ */
#endif /* __SCC_PP_TOKEN_H__ */
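
Review note: the header above relies on the X-macro pattern, where one list is expanded several times with different definitions of `X`. A minimal sketch of the pattern follows, using a hypothetical three-entry list (`DEMO_DIRECTIVES`) rather than the real `SCC_PP_INST_TOKEN`. It generates an enum and a matching name table from the same list, so the two cannot drift apart.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical directive list; each entry is X(spelling, enum constant). */
#define DEMO_DIRECTIVES        \
    X(define, DEMO_TOK_DEFINE) \
    X(ifdef, DEMO_TOK_IFDEF)   \
    X(undef, DEMO_TOK_UNDEF)

/* First expansion: the enum constants, plus a trailing count. */
#define X(name, tok) tok,
typedef enum { DEMO_DIRECTIVES DEMO_TOK_COUNT } demo_tok_t;
#undef X

/* Second expansion: the spellings, in the same order as the enum. */
#define X(name, tok) #name,
static const char *const demo_names[] = {DEMO_DIRECTIVES};
#undef X

/* Map an enum constant back to its spelling. */
static const char *demo_tok_name(demo_tok_t tok) {
    assert(tok < DEMO_TOK_COUNT);
    return demo_names[tok];
}
```

Because the keyword table in the parser is searched with binary search, the list must stay sorted by spelling, which is presumably why the entries were reordered alphabetically in this change.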


@@ -1,30 +1,14 @@
// pprocessor.h - updated header file
/**
* @file pprocessor.h
* @brief Core data structures and interfaces of the C preprocessor
*/
#ifndef __SMCC_PP_H__
#define __SMCC_PP_H__
#ifndef __SCC_PP_H__
#define __SCC_PP_H__
#include <libcore.h>
#include <libutils.h>
// Macro definition kinds
typedef enum {
MACRO_OBJECT, // object-like macro
MACRO_FUNCTION, // function-like macro
} macro_type_t;
typedef VEC(cstring_t) macro_list_t;
// Macro definition structure
typedef struct smcc_macro {
cstring_t name; // macro name
macro_type_t type; // macro kind
macro_list_t replaces; // replacement list
macro_list_t params; // parameter list (function-like macros only)
} smcc_macro_t;
#include <pp_macro.h>
#include <scc_core.h>
#include <scc_utils.h>
// Conditional compilation state
typedef enum {
@@ -41,12 +25,12 @@ typedef struct if_stack_item {
} if_stack_item_t;
// Preprocessor state structure
typedef struct smcc_preprocessor {
core_stream_t *stream; // output stream
strpool_t strpool; // string pool
hashmap_t macros; // macro definition table
VEC(if_stack_item_t) if_stack; // conditional compilation stack
} smcc_pp_t;
typedef struct scc_pproc {
scc_probe_stream_t *stream; // output stream
scc_strpool_t strpool; // string pool
scc_pp_macro_table_t macro_table;
SCC_VEC(if_stack_item_t) if_stack; // conditional compilation stack
} scc_pproc_t;
/**
* @brief Initialize the preprocessor
@@ -54,19 +38,25 @@ typedef struct smcc_preprocessor {
* @param[in] input pointer to the input stream object
* @return output pointer to the output stream object
*/
core_stream_t *pp_init(smcc_pp_t *pp, core_stream_t *input);
/**
* @brief Run preprocessing
* @param[in] pp preprocessor instance
* @return processing result
*/
int pp_process(smcc_pp_t *pp);
// TODO memory release issue
scc_probe_stream_t *scc_pproc_init(scc_pproc_t *pp, scc_probe_stream_t *input);
/**
* @brief Destroy the preprocessor
* @param[in] pp preprocessor instance
*/
void pp_drop(smcc_pp_t *pp);
void scc_pproc_drop(scc_pproc_t *pp);
#endif /* __SMCC_PP_H__ */
/// inner private struct
typedef SCC_VEC(u8) scc_pp_buffer_t;
typedef struct pp_stream {
scc_probe_stream_t stream;
scc_probe_stream_t *input;
scc_pproc_t *self;
scc_pos_t pos;
scc_probe_stream_t *tmp_stream;
} scc_pp_stream_t;
#endif /* __SCC_PP_H__ */


@@ -0,0 +1,412 @@
#include <lex_parser.h>
#include <pp_macro.h>
#include <pp_parse.h>
static inline void scc_generate_cstr(scc_cstring_t *buff) {
scc_cstring_t out_buff = scc_cstring_create();
scc_cstring_append_ch(&out_buff, '\"');
// TODO it is too simple
scc_cstring_append(&out_buff, buff);
scc_cstring_append_ch(&out_buff, '\"');
// FIXME there may be a better solution
scc_cstring_clear(buff);
scc_cstring_append(buff, &out_buff);
scc_cstring_free(&out_buff);
}
#define SCC_PP_IS_LIST_BLANK(i) \
((i) < list->size && scc_vec_at(*list, (i)).data[0] == ' ' && \
scc_vec_at(*list, (i)).data[1] == '\0')
#define SCC_PP_IS_LIST_TO_STRING(i) \
((i) < list->size && scc_vec_at(*list, (i)).data[0] == '#' && \
scc_vec_at(*list, (i)).data[1] == '\0')
#define SCC_PP_IS_LIST_CONNECT(i) \
((i) < list->size && scc_vec_at(*list, (i)).data[0] == '#' && \
scc_vec_at(*list, (i)).data[1] == '#' && \
scc_vec_at(*list, (i)).data[2] == '\0')
#define SCC_PP_USE_CONNECT(font, rear) \
if (rear < list->size) { \
scc_cstring_append(out_buff, &scc_vec_at(*list, font)); \
scc_cstring_append(out_buff, &scc_vec_at(*list, rear)); \
} else { \
scc_cstring_append(out_buff, &scc_vec_at(*list, font)); \
}
// handle # and ## while generating the output string
static inline cbool scc_pp_expand_string_unsafe(const scc_pp_macro_list_t *list,
scc_cstring_t *out_buff) {
for (usize i = 0; i < list->size; ++i) {
if (SCC_PP_IS_LIST_BLANK(i + 1)) {
if (SCC_PP_IS_LIST_CONNECT(i + 2)) {
SCC_PP_USE_CONNECT(i, i + 3);
i += 3;
continue;
}
} else if (SCC_PP_IS_LIST_CONNECT(i + 1)) {
SCC_PP_USE_CONNECT(i, i + 2);
i += 2;
continue;
} else if (SCC_PP_IS_LIST_TO_STRING(i)) {
i += 1;
if (i < list->size) {
scc_generate_cstr(&scc_vec_at(*list, i));
} else {
LOG_ERROR("# need a valid literator");
break;
}
}
scc_cstring_append(out_buff, &scc_vec_at(*list, i));
}
return true;
}
// Expand an object-like macro
cbool scc_pp_expand_object_macro(const scc_pp_macro_t *macro,
scc_cstring_t *out_buff) {
Assert(macro->type == SCC_PP_MACRO_OBJECT && macro->params.size == 0);
Assert(scc_cstring_is_empty(out_buff) == true);
// An object-like macro emits its replacement text, which is then expanded recursively
scc_pp_expand_string_unsafe(&macro->replaces, out_buff);
return true;
}
// Expand a function-like macro
cbool scc_pp_expand_function_macro(const scc_pp_macro_t *macro,
const scc_pp_macro_list_t *origin_params,
scc_pp_macro_list_t *params,
scc_cstring_t *out_buff) {
Assert(macro->type == SCC_PP_MACRO_FUNCTION);
Assert(out_buff != null);
Assert(scc_cstring_is_empty(out_buff) == true);
Assert(params->size == macro->params.size || params->size == 0);
scc_pp_macro_list_t list;
scc_vec_init(list);
for (usize i = 0; i < macro->replaces.size; ++i) {
// TODO ... __VA_ARGS__
for (usize j = 0; j < macro->params.size; ++j) {
if (scc_strcmp(
scc_cstring_as_cstr(&scc_vec_at(macro->replaces, i)),
scc_cstring_as_cstr(&scc_vec_at(macro->params, j))) != 0) {
continue;
}
if (params->size == 0) {
scc_vec_push(list, scc_cstring_from_cstr(""));
goto END;
}
Assert(&scc_vec_at(*params, j) != null);
scc_vec_push(list, scc_cstring_copy(&scc_vec_at(*params, j)));
goto END;
}
scc_vec_push(list, scc_cstring_copy(&scc_vec_at(macro->replaces, i)));
END:;
}
scc_pp_expand_string_unsafe(&list, out_buff);
for (usize i = 0; i < list.size; ++i) {
scc_cstring_free(&scc_vec_at(list, i));
}
scc_vec_free(list);
return true;
}
// State management structure
typedef struct {
scc_pp_macro_table_t *macros;
scc_pp_macro_table_t painted_blue; // macros currently being expanded
int depth;
} scc_expansion_ctx_t;
// Enter a macro expansion
static void enter_macro_expansion(scc_expansion_ctx_t *state,
scc_pp_macro_t *macro) {
// Add to the set of active macros
scc_pp_macro_table_set(&state->painted_blue,
scc_pp_macro_new(&macro->name, macro->type));
}
// Leave a macro expansion (start rescanning)
static void leave_macro_expansion(scc_expansion_ctx_t *state,
scc_pp_macro_t *macro) {
// Remove from the active macros and add to the disabled macros
scc_pp_macro_table_remove(&state->painted_blue, &macro->name);
}
// Check whether the macro may be expanded
static cbool can_expand_macro(scc_expansion_ctx_t *state,
scc_pp_macro_t *macro) {
return scc_pp_macro_table_get(&state->painted_blue, &macro->name) == null;
}
typedef struct {
scc_cstring_t identifier;
scc_pp_macro_list_t args;
} scc_expand_unit_t;
void scc_pp_parse_expand_macro(scc_probe_stream_t *stream, scc_pos_t *pos,
scc_expand_unit_t *unit) {
Assert(stream != null && pos != null && unit != null);
// TODO Assert empty unit
scc_cstring_init(&unit->identifier);
scc_vec_init(unit->args);
cbool ret = scc_lex_parse_identifier(stream, pos, &unit->identifier);
Assert(ret == true);
scc_probe_stream_sync(stream);
do {
if (scc_lex_parse_is_whitespace(scc_probe_stream_peek(stream))) {
scc_lex_parse_skip_whitespace(stream, pos);
}
if (scc_probe_stream_peek(stream) != '(') {
// TODO maybe error
// because an empty value cannot be represented correctly
break;
}
// TODO maybe like f()() => maybe twice expand
ret = scc_pp_parse_macro_arguments(stream, &unit->args);
scc_probe_stream_sync(stream);
Assert(ret == true);
} while (0);
}
static cbool expand_buffer(const scc_cstring_t *in, scc_cstring_t *out,
scc_expansion_ctx_t *state);
static inline void args2str_append(const scc_pp_macro_list_t *args,
scc_cstring_t *out) {
Assert(args != null);
if (args->size != 0) {
scc_cstring_append_ch(out, '(');
for (usize i = 0; i < args->size; ++i) {
scc_cstring_append(out, &scc_vec_at(*args, i));
if (i != args->size - 1) {
scc_cstring_append_ch(out, ',');
scc_cstring_append_ch(out, ' ');
}
}
scc_cstring_append_ch(out, ')');
}
}
static cbool expand_macro(const scc_expand_unit_t *unit, scc_cstring_t *out,
scc_expansion_ctx_t *state) {
scc_pp_macro_t *macro =
scc_pp_macro_table_get(state->macros, &unit->identifier);
if (macro == null) {
scc_cstring_append(out, &unit->identifier);
args2str_append(&unit->args, out);
return true;
}
cbool ret;
if (macro->type == SCC_PP_MACRO_OBJECT) {
if (can_expand_macro(state, macro)) {
ret = scc_pp_expand_object_macro(macro, out);
Assert(ret == true);
args2str_append(&unit->args, out);
} else {
scc_cstring_append(out, &macro->name);
args2str_append(&unit->args, out);
return true;
}
} else if (macro->type == SCC_PP_MACRO_FUNCTION) {
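/**
* 1. Expand the arguments first
* 2. Rescan after substitution
* 3. Do not expand macros in the blue set
* 4. Do not expand operands of # and ##
* 5. Check the trailing parentheses
*/
scc_pp_macro_list_t expanded_params;
scc_vec_init(expanded_params);
// expand params first, recursively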
for (usize i = 0; i < unit->args.size; ++i) {
scc_cstring_t param = scc_vec_at(unit->args, i);
scc_cstring_t out = scc_cstring_create();
expand_buffer(&param, &out, state);
scc_vec_push(expanded_params, out);
}
if (can_expand_macro(state, macro)) {
ret = scc_pp_expand_function_macro(macro, &unit->args,
&expanded_params, out);
Assert(ret == true);
} else {
scc_cstring_append(out, &macro->name);
args2str_append(&expanded_params, out);
return true;
}
// TODO memory leak
} else {
UNREACHABLE();
}
// Rescan the expansion result
scc_cstring_t rescanned = scc_cstring_create();
enter_macro_expansion(state, macro);
expand_buffer(out, &rescanned, state);
leave_macro_expansion(state, macro);
scc_cstring_free(out);
*out = rescanned;
return true;
}
static cbool expand_buffer(const scc_cstring_t *in, scc_cstring_t *out,
scc_expansion_ctx_t *state) {
scc_probe_stream_t *in_stream = scc_mem_probe_stream_alloc(
scc_cstring_as_cstr(in), scc_cstring_len(in), false);
int ch;
scc_pos_t pos;
scc_expand_unit_t unit;
while ((ch = scc_probe_stream_peek(in_stream)) != scc_stream_eof) {
if (scc_lex_parse_is_identifier_prefix(ch)) {
// Check recursively
scc_cstring_t expanded_buffer = scc_cstring_create();
scc_pp_parse_expand_macro(in_stream, &pos, &unit);
scc_probe_stream_reset(in_stream);
if (expand_macro(&unit, &expanded_buffer, state) == false) {
scc_cstring_free(&expanded_buffer);
return false;
} else {
scc_cstring_append(out, &expanded_buffer);
scc_cstring_free(&expanded_buffer);
}
} else {
scc_cstring_append_ch(out, scc_probe_stream_consume(in_stream));
}
}
scc_probe_stream_drop(in_stream);
return true;
}
// static cbool scc_pp_expand_macro_impl(scc_probe_stream_t *stream,
// scc_expansion_ctx_t *ctx, scc_pos_t
// *pos, scc_cstring_t *output) {
// // TODO self position and it maybe is a stack on #include ?
// // Scan recursively
// if (ctx->depth-- <= 0) {
// scc_cstring_free(output);
// return false;
// }
// cbool ret;
// scc_pp_macro_table_t *macros = ctx->macros;
// scc_cstring_t identifier = scc_cstring_create();
// ret = scc_lex_parse_identifier(stream, pos, &identifier);
// Assert(ret == true);
// scc_pp_macro_t *macro = scc_pp_macro_table_get(macros, &identifier);
// // 1. Not a macro: emit the identifier as-is
// if (macro == null) {
// // Not a macro, emit directly
// usize length = scc_cstring_len(&identifier);
// *out_stream = scc_mem_probe_stream_alloc(
// scc_cstring_move_cstr(&identifier), length, true);
// return true;
// } else {
// scc_cstring_free(&identifier);
// }
// // Collect the arguments (for a function-like macro)
// scc_pp_macro_list_t params;
// scc_pp_macro_list_t expanded_params;
// scc_vec_init(params);
// scc_vec_init(expanded_params);
// if (macro->type == SCC_PP_MACRO_FUNCTION) {
// // TODO when expand need check another func with () at the end
// scc_lex_parse_skip_whitespace(stream, &pos);
// if (scc_probe_stream_peek(stream) != '(') {
// goto ORIGIN;
// }
// ret = scc_pp_parse_macro_arguments(stream, &params);
// Assert(ret == true);
// // expand params first, recursively
// for (usize i = 0; i < params.size; ++i) {
// scc_cstring_t param = scc_vec_at(params, i);
// scc_cstring_t out = scc_cstring_create();
// expand_buffer(&param, &out, ctx);
// scc_vec_push(expanded_params, out);
// }
// }
// // 2. Skip when a repeated expansion is detected
// // Check whether the macro may be expanded
// if (!can_expand_macro(ctx, macro)) {
// ORIGIN:
// // Emit the original invocation
// scc_cstring_t original = scc_cstring_create();
// scc_cstring_append(&original, &macro->name);
// if (macro->type == SCC_PP_MACRO_FUNCTION && expanded_params.size !=
// 0) {
// scc_cstring_append_ch(&original, '(');
// for (usize i = 0; i < expanded_params.size; ++i) {
// scc_cstring_append(&original, &scc_vec_at(expanded_params,
// i)); if (i != expanded_params.size - 1) {
// scc_cstring_append_ch(&original, ',');
// scc_cstring_append_ch(&original, ' ');
// }
// }
// scc_cstring_append_ch(&original, ')');
// }
// *out_stream = scc_mem_probe_stream_alloc(
// scc_cstring_as_cstr(&original), scc_cstring_len(&original),
// true);
// scc_vec_free(params);
// scc_vec_free(expanded_params);
// return true;
// }
// // Begin expansion
// scc_cstring_t expanded = scc_cstring_create();
// if (macro->type == SCC_PP_MACRO_OBJECT) {
// ret = scc_pp_expand_object_macro(macro, &expanded);
// Assert(ret == true);
// } else if (macro->type == SCC_PP_MACRO_FUNCTION) {
// ret = scc_pp_expand_function_macro(macro, &params, &expanded_params,
// &expanded);
// Assert(ret == true);
// } else {
// UNREACHABLE();
// }
// // Rescan the expansion result
// // Turn the expanded content into a stream and expand it recursively
// scc_cstring_t rescanned = scc_cstring_create();
// enter_macro_expansion(ctx, macro);
// expand_buffer(&expanded, &rescanned, ctx);
// leave_macro_expansion(ctx, macro);
// scc_cstring_free(&expanded);
// // TODO memory leak
// *out_stream = scc_mem_probe_stream_alloc(scc_cstring_as_cstr(&rescanned),
// scc_cstring_len(&rescanned),
// true);
// return true;
// }
cbool scc_pp_expand_macro(scc_probe_stream_t *stream,
scc_pp_macro_table_t *macros,
scc_probe_stream_t **out_stream, int depth) {
Assert(depth > 0 && stream != null && macros != null && out_stream != null);
scc_expansion_ctx_t state;
state.depth = depth;
scc_pp_marco_table_init(&state.painted_blue);
state.macros = macros;
scc_cstring_t buffer = scc_cstring_create();
scc_pos_t pos;
// cbool ret = scc_pp_expand_macro_impl(stream, &state, &pos, &buffer);
// expand_buffer(stream, &buffer, &state);
scc_expand_unit_t unit;
scc_pp_parse_expand_macro(stream, &pos, &unit);
scc_probe_stream_reset(stream);
expand_macro(&unit, &buffer, &state);
*out_stream = scc_mem_probe_stream_alloc(scc_cstring_as_cstr(&buffer),
scc_cstring_len(&buffer), true);
scc_pp_macro_table_drop(&state.painted_blue);
return true;
}
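
Review note: the `painted_blue` table is what stops a self-referential macro from expanding forever. The sketch below shows that guard in isolation for object-like macros only, operating on C strings. It is an illustration under simplifying assumptions, not this file's algorithm: all `demo_*` names are hypothetical, the active set is a small array instead of a hash table, and a blocked name is simply copied through rather than marked permanently as the C standard describes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *body;
} demo_macro_t;

#define DEMO_MAX_ACTIVE 16

typedef struct {
    const demo_macro_t *macros;
    size_t macro_count;
    const char *active[DEMO_MAX_ACTIVE]; /* names currently being expanded */
    size_t active_count;
} demo_ctx_t;

static const demo_macro_t *demo_find(const demo_ctx_t *ctx, const char *id,
                                     size_t len) {
    for (size_t i = 0; i < ctx->macro_count; ++i) {
        const char *n = ctx->macros[i].name;
        if (strlen(n) == len && memcmp(n, id, len) == 0) {
            return &ctx->macros[i];
        }
    }
    return NULL;
}

static int demo_is_active(const demo_ctx_t *ctx, const char *name) {
    for (size_t i = 0; i < ctx->active_count; ++i) {
        if (strcmp(ctx->active[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Append len bytes of s to the NUL-terminated buffer out of capacity cap. */
static void demo_put(char *out, size_t cap, const char *s, size_t len) {
    size_t used = strlen(out);
    assert(used + len < cap);
    memcpy(out + used, s, len);
    out[used + len] = '\0';
}

/* Append the expansion of `in` to `out`. */
static void demo_expand(demo_ctx_t *ctx, const char *in, char *out,
                        size_t cap) {
    while (*in != '\0') {
        if (isalpha((unsigned char)*in) || *in == '_') {
            size_t len = 1;
            while (isalnum((unsigned char)in[len]) || in[len] == '_') {
                ++len;
            }
            const demo_macro_t *m = demo_find(ctx, in, len);
            if (m != NULL && !demo_is_active(ctx, m->name)) {
                assert(ctx->active_count < DEMO_MAX_ACTIVE);
                ctx->active[ctx->active_count++] = m->name; /* paint */
                demo_expand(ctx, m->body, out, cap);        /* rescan */
                ctx->active_count--;                        /* unpaint */
            } else {
                demo_put(out, cap, in, len); /* not a macro, or blocked */
            }
            in += len;
        } else {
            demo_put(out, cap, in, 1);
            ++in;
        }
    }
}
```

With `A` defined as `B + 1` and `B` defined as `A * 2`, expanding `A` stops at the inner `A` because it is still in the active set, so the recursion terminates without needing a depth limit.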

155
libs/pprocessor/src/macro.c Normal file

@@ -0,0 +1,155 @@
#include <pp_macro.h>
// Create a macro object
scc_pp_macro_t *scc_pp_macro_new(const scc_cstring_t *name,
scc_pp_macro_type_t type) {
scc_pp_macro_t *macro = scc_malloc(sizeof(scc_pp_macro_t));
if (!macro) {
LOG_ERROR("Failed to allocate memory for macro");
return null;
}
macro->name = scc_cstring_copy(name);
macro->type = type;
scc_vec_init(macro->params);
scc_vec_init(macro->replaces);
return macro;
}
// Destroy a macro object
void scc_pp_macro_drop(scc_pp_macro_t *macro) {
if (!macro)
return;
scc_cstring_free(&macro->name);
// Free the parameter list
for (usize i = 0; i < macro->params.size; ++i) {
scc_cstring_free(&scc_vec_at(macro->params, i));
}
scc_vec_free(macro->params);
// Free the replacement list
for (usize i = 0; i < macro->replaces.size; ++i) {
scc_cstring_free(&scc_vec_at(macro->replaces, i));
}
scc_vec_free(macro->replaces);
scc_free(macro);
}
// Add an object-like macro
cbool scc_pp_add_object_macro(scc_pp_macro_table_t *macros,
const scc_cstring_t *name,
const scc_pp_macro_list_t *replacement) {
if (!macros || !name || !replacement)
return false;
scc_pp_macro_t *macro = scc_pp_macro_new(name, SCC_PP_MACRO_OBJECT);
if (!macro)
return false;
macro->replaces = *replacement;
// Check whether a macro with the same name already exists
scc_pp_macro_t *existing = scc_hashtable_get(&macros->table, &macro->name);
if (existing) {
LOG_WARN("Redefining macro: %s", scc_cstring_as_cstr(&macro->name));
scc_pp_macro_drop(existing);
}
scc_hashtable_set(&macros->table, &macro->name, macro);
return true;
}
// Add a function-like macro
cbool scc_pp_add_function_macro(scc_pp_macro_table_t *macros,
const scc_cstring_t *name,
const scc_pp_macro_list_t *params,
const scc_pp_macro_list_t *replacement) {
if (!macros || !name || !params || !replacement)
return false;
scc_pp_macro_t *macro = scc_pp_macro_new(name, SCC_PP_MACRO_FUNCTION);
if (!macro)
return false;
// Copy the parameter list
macro->params = *params;
macro->replaces = *replacement;
// Check whether a macro with the same name already exists
scc_pp_macro_t *existing = scc_hashtable_get(&macros->table, &macro->name);
if (existing) {
LOG_WARN("Redefining macro: %s", scc_cstring_as_cstr(&macro->name));
scc_pp_macro_drop(existing);
}
scc_hashtable_set(&macros->table, &macro->name, macro);
return true;
}
/// macro_table
scc_pp_macro_t *scc_pp_macro_table_set(scc_pp_macro_table_t *pp,
scc_pp_macro_t *macro) {
Assert(pp != null && macro != null);
return scc_hashtable_set(&pp->table, &macro->name, macro);
}
// Look up a macro definition
scc_pp_macro_t *scc_pp_macro_table_get(scc_pp_macro_table_t *pp,
const scc_cstring_t *name) {
return scc_hashtable_get(&pp->table, name);
}
// Remove a macro from the preprocessor
cbool scc_pp_macro_table_remove(scc_pp_macro_table_t *pp,
const scc_cstring_t *name) {
if (!pp || !name)
return false;
scc_pp_macro_t *macro = scc_hashtable_get(&pp->table, name);
if (!macro)
return false;
scc_hashtable_del(&pp->table, name);
scc_pp_macro_drop(macro);
return true;
}
static u32 hash_func(const void *key) {
const scc_cstring_t *string = (const scc_cstring_t *)key;
return scc_strhash32(scc_cstring_as_cstr(string));
}
static int hash_cmp(const void *key1, const void *key2) {
const scc_cstring_t *str1 = (const scc_cstring_t *)key1;
const scc_cstring_t *str2 = (const scc_cstring_t *)key2;
if (str1->size != str2->size) {
return str1->size - str2->size;
}
return scc_strcmp(scc_cstring_as_cstr(str1), scc_cstring_as_cstr(str2));
}
void scc_pp_marco_table_init(scc_pp_macro_table_t *macros) {
Assert(macros != null);
macros->table.hash_func = hash_func;
macros->table.key_cmp = hash_cmp;
scc_hashtable_init(&macros->table);
}
static int macro_free(const void *key, void *value, void *context) {
(void)key;
(void)context;
scc_pp_macro_drop(value);
return 0;
}
void scc_pp_macro_table_drop(scc_pp_macro_table_t *macros) {
Assert(macros != null);
scc_hashtable_foreach(&macros->table, macro_free, null);
scc_hashtable_drop(&macros->table);
}

295
libs/pprocessor/src/parse.c Normal file

@@ -0,0 +1,295 @@
#include <lex_parser.h>
#include <pp_macro.h>
#include <pp_parse.h>
#include <pp_token.h>
static const struct {
const char *name;
scc_pp_token_t tok;
} keywords[] = {
#define X(name, type, tok) {#name, tok},
SCC_PP_INST_TOKEN
#undef X
};
// Look up a keyword using binary search
static inline int keyword_cmp(const char *name, int len) {
int low = 0;
int high = sizeof(keywords) / sizeof(keywords[0]) - 1;
while (low <= high) {
int mid = (low + high) / 2;
const char *key = keywords[mid].name;
int cmp = 0;
// Custom string comparison logic
for (int i = 0; i < len; i++) {
if (name[i] != key[i]) {
cmp = (unsigned char)name[i] - (unsigned char)key[i];
break;
}
if (name[i] == '\0')
break; // stop early at the terminator
}
if (cmp == 0) {
// Exact-match check (same length)
if (key[len] == '\0')
return mid;
cmp = -1; // the current keyword is longer than the input
}
if (cmp < 0) {
high = mid - 1;
} else {
low = mid + 1;
}
}
return -1; // Not a keyword.
}
static inline void try_to_cut_list(scc_pp_macro_list_t *list,
scc_cstring_t *buff) {
if (scc_cstring_len(buff) != 0) {
scc_vec_push(*list, *buff);
*buff = scc_cstring_create();
}
}
cbool scc_pp_parse_macro_replace_list(scc_probe_stream_t *stream,
scc_pp_macro_list_t *list) {
Assert(stream != null && list != null);
// scc_probe_stream_reset(stream);
scc_vec_init(*list);
scc_cstring_t replacement = scc_cstring_create();
int ch;
scc_pos_t pos = scc_pos_create();
while ((ch = scc_probe_stream_peek(stream)) != scc_stream_eof) {
if (scc_lex_parse_is_endline(ch)) {
break;
}
if (scc_lex_parse_is_identifier_prefix(ch)) {
try_to_cut_list(list, &replacement);
cbool ret = scc_lex_parse_identifier(stream, &pos, &replacement);
Assert(ret == true);
try_to_cut_list(list, &replacement);
} else if (ch == '#') {
// Handle the # and ## operators
scc_probe_stream_next(stream);
try_to_cut_list(list, &replacement);
scc_cstring_append_ch(&replacement, '#');
if (scc_probe_stream_peek(stream) == '#') {
// ## concatenation operator
scc_probe_stream_next(stream);
scc_cstring_append_ch(&replacement, '#');
}
// Keep whitespace from interfering with parsing as much as possible
scc_lex_parse_skip_whitespace(stream, &pos);
try_to_cut_list(list, &replacement);
} else if (scc_lex_parse_is_whitespace(ch)) {
try_to_cut_list(list, &replacement);
scc_lex_parse_skip_whitespace(stream, &pos);
scc_cstring_append_ch(&replacement, ' ');
try_to_cut_list(list, &replacement);
} else {
scc_probe_stream_next(stream);
scc_cstring_append_ch(&replacement, (char)ch);
}
}
if (scc_cstring_len(&replacement) != 0) {
scc_vec_push(*list, replacement);
replacement = scc_cstring_create();
}
// for (usize i = 0; i < list->size; ++i) {
// LOG_DEBUG("list %d: %s", (int)i,
// scc_cstring_as_cstr(&scc_vec_at(*list, i)));
// }
return true;
}
// Parse a macro argument list
cbool scc_pp_parse_macro_arguments(scc_probe_stream_t *stream,
scc_pp_macro_list_t *args) {
Assert(stream != null && args != null);
scc_vec_init(*args);
int ch;
// scc_probe_stream_reset(stream);
// Skip '('
ch = scc_probe_stream_peek(stream);
if (ch != '(') {
return false;
}
scc_probe_stream_next(stream); // consume '('
int paren_depth = 1;
scc_cstring_t current_arg = scc_cstring_create();
scc_pos_t pos = scc_pos_create();
while (paren_depth > 0) {
ch = scc_probe_stream_peek(stream);
if (ch == scc_stream_eof) {
scc_cstring_free(&current_arg);
scc_cstring_free(&pos.name);
return false;
}
if (ch == '(') {
paren_depth++;
scc_cstring_append_ch(&current_arg, (char)ch);
scc_probe_stream_next(stream);
} else if (ch == ')') {
paren_depth--;
if (paren_depth > 0) {
scc_cstring_append_ch(&current_arg, (char)ch);
}
scc_probe_stream_next(stream);
} else if (ch == ',' && paren_depth == 1) {
// Argument separator
scc_vec_push(*args, current_arg);
current_arg = scc_cstring_create();
scc_probe_stream_next(stream);
// Skip whitespace after the argument
scc_lex_parse_skip_whitespace(stream, &pos);
} else {
scc_cstring_append_ch(&current_arg, (char)ch);
scc_probe_stream_next(stream);
}
}
// Add the last argument
if (!scc_cstring_is_empty(&current_arg)) {
scc_vec_push(*args, current_arg);
} else {
scc_cstring_free(&current_arg);
}
scc_cstring_free(&pos.name);
return true;
}
static cbool safe_skip_backspace_if_endline(scc_probe_stream_t *stream,
scc_pos_t *pos) {
// scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
// FIXME maybe it not correct
while (ch == '\r' || ch == '\n' || ch == ' ' || ch == '\t') {
if (scc_lex_parse_is_endline(ch)) {
scc_lex_parse_skip_endline(stream, pos);
return true;
}
scc_probe_stream_next(stream);
ch = scc_probe_stream_peek(stream);
}
// scc_probe_stream_reset(stream);
return false;
}
void scc_pp_parse_directive(scc_probe_stream_t *stream, scc_pos_t *pos,
scc_pp_macro_table_t *macros) {
Assert(stream != null);
// scc_probe_stream_reset(stream);
// Skip '#' and the whitespace that follows
if (scc_probe_stream_peek(stream) != '#') {
LOG_WARN("Invalid directive");
return;
}
scc_pos_next(pos);
scc_probe_stream_next(stream);
if (safe_skip_backspace_if_endline(stream, pos))
return;
// Parse the directive name
scc_cstring_t directive = scc_cstring_create();
if (!scc_lex_parse_identifier(stream, pos, &directive)) {
goto ERR;
}
if (safe_skip_backspace_if_endline(stream, pos))
goto FREE;
scc_pp_token_t token = keyword_cmp(scc_cstring_as_cstr(&directive),
scc_cstring_len(&directive));
scc_cstring_t name = scc_cstring_create();
switch (token) {
case SCC_PP_TOK_DEFINE: {
if (!scc_lex_parse_identifier(stream, pos, &name)) {
goto ERR;
}
// Check for a function-like macro: is the macro name immediately followed by '(' (no whitespace)?
// scc_probe_stream_reset(stream);
int ch = scc_probe_stream_peek(stream);
cbool has_whitespace = scc_lex_parse_is_whitespace(ch);
if (has_whitespace && safe_skip_backspace_if_endline(stream, pos)) {
goto FREE;
}
if (!has_whitespace && ch == '(') {
// Function-like macro
scc_pp_macro_list_t params;
if (!scc_pp_parse_macro_arguments(stream, &params)) {
goto ERR;
}
ch = scc_probe_stream_peek(stream);
if (ch == ')') {
scc_probe_stream_next(stream); // consume ')'
}
if (safe_skip_backspace_if_endline(stream, pos)) {
goto FREE;
}
scc_pp_macro_list_t replacement;
scc_pp_parse_macro_replace_list(stream, &replacement);
scc_pp_add_function_macro(macros, &name, &params, &replacement);
} else {
// Object-like macro
scc_pp_macro_list_t replacement;
scc_pp_parse_macro_replace_list(stream, &replacement);
scc_pp_add_object_macro(macros, &name, &replacement);
}
break;
}
case SCC_PP_TOK_UNDEF: {
if (scc_lex_parse_identifier(stream, pos, &name)) {
cbool ret = scc_pp_macro_table_remove(macros, &name);
Assert(ret == true);
}
break;
}
case SCC_PP_TOK_INCLUDE:
case SCC_PP_TOK_IF:
case SCC_PP_TOK_IFDEF:
case SCC_PP_TOK_IFNDEF:
case SCC_PP_TOK_ELSE:
case SCC_PP_TOK_ELIF:
case SCC_PP_TOK_ELIFDEF:
case SCC_PP_TOK_ELIFNDEF:
case SCC_PP_TOK_ENDIF:
case SCC_PP_TOK_LINE:
case SCC_PP_TOK_EMBED:
case SCC_PP_TOK_ERROR:
case SCC_PP_TOK_WARNING:
case SCC_PP_TOK_PRAGMA:
// Skip this line for now
scc_lex_parse_skip_line(stream, pos);
TODO();
break;
default:
LOG_WARN("Unknown preprocessor directive: %s",
scc_cstring_as_cstr(&directive));
scc_lex_parse_skip_line(stream, pos);
}
ERR:
scc_lex_parse_skip_line(stream, pos);
FREE:
scc_cstring_free(&directive);
scc_cstring_free(&name);
}
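
Review note: `keyword_cmp` in this file does a binary search over a sorted keyword table, given a name and an explicit length. The sketch below restates that lookup with `strncmp` in place of the hand-written comparison loop. It is an illustration under assumptions: `demo_keywords` is a hypothetical, shortened table and must stay sorted in byte order for the search to be correct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table, sorted in ascending byte order. */
static const char *const demo_keywords[] = {
    "define", "elif",   "else",    "endif", "if",
    "ifdef",  "ifndef", "include", "undef",
};

/* Return the index of the keyword spelled by the first len bytes of name,
 * or -1 if there is no such keyword. name must be NUL-terminated or have at
 * least len readable bytes. */
static int demo_keyword_index(const char *name, size_t len) {
    assert(name != NULL);
    int low = 0;
    int high = (int)(sizeof(demo_keywords) / sizeof(demo_keywords[0])) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        const char *key = demo_keywords[mid];
        int cmp = strncmp(name, key, len);
        if (cmp == 0) {
            if (key[len] == '\0') {
                return mid; /* same length: exact match */
            }
            cmp = -1; /* name is a strict prefix of key, so it sorts first */
        }
        if (cmp < 0) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    return -1;
}
```

The prefix case matters for pairs such as `if` and `ifdef`: comparing only `len` bytes reports equality, so the terminator check is what distinguishes an exact match from a longer keyword.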


@@ -4,424 +4,129 @@
*/
#include <lex_parser.h>
#include <pp_macro.h>
#include <pp_parse.h>
#include <pp_token.h>
#include <pprocessor.h>
#define PPROCESSER_BUFFER_SIZE (1024)
static u32 hash_func(cstring_t *string) {
return smcc_strhash32(cstring_as_cstr(string));
}
#ifdef TEST_MODE
#define MAX_MACRO_EXPANSION_DEPTH 16
#else
#define MAX_MACRO_EXPANSION_DEPTH 64 // maximum expansion depth, to prevent infinite recursion
#endif
static int hash_cmp(const cstring_t *str1, const cstring_t *str2) {
if (str1->size != str2->size) {
return str1->size - str2->size;
}
return smcc_strcmp(cstring_as_cstr(str1), cstring_as_cstr(str2));
}
// Add a macro definition
static void add_macro(smcc_pp_t *pp, const cstring_t *name,
const macro_list_t *replaces, const macro_list_t *params,
macro_type_t type) {
smcc_macro_t *macro = smcc_malloc(sizeof(smcc_macro_t));
macro->name = *name;
macro->type = type;
if (replaces) {
macro->replaces = *replaces;
} else {
vec_init(macro->replaces);
}
if (params) {
macro->params = *params;
} else {
vec_init(macro->params);
}
hashmap_set(&pp->macros, &macro->name, macro);
}
// Look up a macro definition
static smcc_macro_t *find_macro(smcc_pp_t *pp, cstring_t *name) {
return hashmap_get(&pp->macros, name);
}
// Skeleton for conditional compilation handling
static void handle_if(smcc_pp_t *pp, const char *condition) {
if_stack_item_t item;
int cond_value;
// cond_value = evaluate_condition(pp, condition);
item.state = cond_value ? IFState_TRUE : IFState_FALSE;
item.skip = !cond_value;
vec_push(pp->if_stack, item);
}
static void handle_else(smcc_pp_t *pp) {
if (pp->if_stack.size == 0) {
// Error: no matching #if
return;
}
if_stack_item_t *top = &vec_at(pp->if_stack, pp->if_stack.size - 1);
if (top->state == IFState_ELSE) {
// Error: duplicate #else
return;
}
top->skip = !top->skip;
top->state = IFState_ELSE;
}
static void handle_include(smcc_pp_t *pp, const char *filename,
int system_header) {
// Logic to resolve the file path
// Create a new input stream
// Process the included file recursively
}
// Parse an identifier
static cstring_t parse_identifier(core_stream_t *stream) {
cstring_t identifier = cstring_new();
core_stream_reset_char(stream);
int ch = core_stream_peek_char(stream);
// An identifier starts with a letter or an underscore
if (!((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_')) {
LOG_WARN("Invalid identifier");
return identifier;
}
do {
cstring_push(&identifier, (char)ch);
core_stream_next_char(stream); // consume the character
ch = core_stream_peek_char(stream);
} while ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
(ch >= '0' && ch <= '9') || ch == '_');
return identifier;
}
// Skip whitespace characters ' ' and '\t'
static void skip_whitespace(core_stream_t *stream) {
int ch;
core_stream_reset_char(stream);
while ((ch = core_stream_peek_char(stream)) != core_stream_eof) {
if (ch == ' ' || ch == '\t') {
core_stream_next_char(stream);
} else {
break;
}
}
}
#define X(name, type, tok) SMCC_STR(name),
static const char *token_strings[] = {PP_INST_TOKEN};
#undef X
static const struct {
const char *name;
pp_token_t tok;
} keywords[] = {
#define X(name, type, tok) {#name, tok},
PP_INST_TOKEN
#undef X
};
// by using binary search to find the keyword
static inline int keyword_cmp(const char *name, int len) {
int low = 0;
int high = sizeof(keywords) / sizeof(keywords[0]) - 1;
while (low <= high) {
int mid = (low + high) / 2;
const char *key = keywords[mid].name;
int cmp = 0;
// 自定义字符串比较逻辑
for (int i = 0; i < len; i++) {
if (name[i] != key[i]) {
cmp = (unsigned char)name[i] - (unsigned char)key[i];
break;
}
if (name[i] == '\0')
break; // stop early at the terminator
}
if (cmp == 0) {
// Exact match check (same length)
if (key[len] == '\0')
return mid;
cmp = -1; // the current keyword is longer than the input
}
if (cmp < 0) {
high = mid - 1;
} else {
low = mid + 1;
}
}
return -1; // Not a keyword.
}
typedef struct pp_stream {
core_stream_t stream;
core_stream_t *input;
smcc_pp_t *self;
usize size;
usize pos;
char buffer[PPROCESSER_BUFFER_SIZE];
} pp_stream_t;
static cbool parse_list(pp_stream_t *_stream, macro_list_t *list,
cbool is_param) {
Assert(_stream != null);
core_stream_t *stream = _stream->input;
static int pp_stream_read_char(scc_probe_stream_t *_stream) {
scc_pp_stream_t *stream = (scc_pp_stream_t *)_stream;
Assert(stream != null);
core_stream_reset_char(stream);
vec_init(*list);
int ch;
cstring_t str = cstring_new();
core_pos_t pos;
while ((ch = core_stream_peek_char(stream)) != core_stream_eof) {
if (is_param) {
// ( param ) ( param, ... ) ( ... )
if (lex_parse_is_whitespace(ch)) {
// TODO: #define ( A A , B ) should raise an error
lex_parse_skip_whitespace(stream, &pos);
core_stream_reset_char(stream);
} else if (ch == ',') {
vec_push(*list, str);
str = cstring_new();
core_stream_next_char(stream);
continue;
} else if (ch == ')') {
break;
} else if (ch == core_stream_eof || lex_parse_is_endline(ch)) {
LOG_ERROR("Invalid parameter list");
return false;
}
} else {
// Replacement list
if (lex_parse_is_whitespace(ch)) {
lex_parse_skip_whitespace(stream, &pos);
vec_push(*list, str);
str = cstring_new();
core_stream_reset_char(stream);
continue;
} else if (lex_parse_is_endline(ch)) {
break;
}
}
core_stream_next_char(stream);
cstring_push(&str, (char)ch);
}
vec_push(*list, str);
str = cstring_new();
return true;
}
// Parse a preprocessor directive
static void parse_directive(pp_stream_t *_stream) {
Assert(_stream != null);
core_stream_t *stream = _stream->input;
Assert(stream != null);
int ch;
core_pos_t pos;
core_stream_reset_char(stream);
// Skip '#' and the whitespace that follows
if (core_stream_peek_char(stream) != '#') {
LOG_WARN("Invalid directive");
return;
}
core_stream_next_char(stream);
// TODO: allow the null directive ('#' followed by a newline), which has no effect.
skip_whitespace(stream);
// Parse the directive name
cstring_t directive = parse_identifier(stream);
if (cstring_is_empty(&directive)) {
LOG_ERROR("expected identifier");
goto ERR;
}
skip_whitespace(stream);
core_stream_reset_char(stream);
pp_token_t token =
keyword_cmp(cstring_as_cstr(&directive), cstring_len(&directive));
switch (token) {
case PP_TOK_DEFINE: {
cstring_t name = parse_identifier(stream);
if (cstring_is_empty(&name)) {
LOG_ERROR("expected identifier");
goto ERR;
}
skip_whitespace(stream);
core_stream_reset_char(stream);
int ch = core_stream_peek_char(stream);
if (ch == '(') {
macro_list_t params;
parse_list(_stream, &params, true);
ch = core_stream_next_char(stream);
if (ch != ')') {
}
goto ERR;
}
macro_list_t replacement;
parse_list(_stream, &replacement, false);
add_macro(_stream->self, &name, &replacement, NULL, MACRO_OBJECT);
break;
}
case PP_TOK_UNDEF:
case PP_TOK_INCLUDE:
case PP_TOK_IF:
case PP_TOK_IFDEF:
case PP_TOK_IFNDEF:
case PP_TOK_ELSE:
case PP_TOK_ELIF:
case PP_TOK_ELIFDEF:
case PP_TOK_ELIFNDEF:
case PP_TOK_ENDIF:
case PP_TOK_LINE:
case PP_TOK_EMBED:
case PP_TOK_ERROR:
case PP_TOK_WARNING:
case PP_TOK_PRAMA:
TODO();
break;
default:
LOG_WARN("Unknown preprocessor directive: %s",
cstring_as_cstr(&directive));
}
// TODO: Windows \r\n, Linux \n, classic Mac \r => all need to be translated to \n
core_stream_reset_char(stream);
lex_parse_skip_line(stream, &pos);
cstring_free(&directive);
return;
ERR:
// TODO skip line
LOG_FATAL("Unhandled preprocessor directive");
}
static inline void stream_push_string(pp_stream_t *stream, cstring_t *str) {
stream->size += cstring_len(str);
Assert(stream->size <= PPROCESSER_BUFFER_SIZE);
smcc_memcpy(stream->buffer, cstring_as_cstr(str), stream->size);
}
static inline void stream_push_char(pp_stream_t *stream, int ch) {
stream->buffer[stream->size++] = ch;
Assert(stream->size <= PPROCESSER_BUFFER_SIZE);
}
static int next_char(core_stream_t *_stream) {
pp_stream_t *stream = (pp_stream_t *)_stream;
Assert(stream != null);
READ_BUF:
if (stream->size != 0) {
if (stream->pos < stream->size) {
return stream->buffer[stream->pos++];
} else {
stream->size = 0;
stream->pos = 0;
}
if (stream->tmp_stream != null &&
(ch = scc_probe_stream_consume(stream->tmp_stream)) != scc_stream_eof) {
return ch;
}
RETRY:
core_stream_reset_char(stream->input);
int ch = core_stream_peek_char(stream->input);
scc_probe_stream_reset(stream->input);
ch = scc_probe_stream_peek(stream->input);
if (ch == '#') {
parse_directive(stream);
scc_pp_parse_directive(stream->input, &stream->pos,
&stream->self->macro_table);
scc_probe_stream_sync(stream->input);
goto RETRY;
} else if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
ch == '_') {
cstring_t identifier = parse_identifier(stream->input);
smcc_macro_t *macro = find_macro(stream->self, &identifier);
if (macro == null) {
stream_push_string(stream, &identifier);
cstring_free(&identifier);
goto READ_BUF;
} else {
cstring_free(&identifier);
} else if (scc_lex_parse_is_identifier_prefix(ch)) {
cbool ret =
scc_pp_expand_macro(stream->input, &stream->self->macro_table,
&stream->tmp_stream, MAX_MACRO_EXPANSION_DEPTH);
if (ret == false) {
LOG_ERROR("macro_expand_error");
}
if (macro->type == MACRO_OBJECT) {
for (usize i = 0; i < macro->replaces.size; ++i) {
stream_push_string(stream, &vec_at(macro->replaces, i));
// usize is unsigned: compare with i + 1 instead of subtracting
if (i + 1 < macro->replaces.size)
stream_push_char(stream, ' ');
}
goto READ_BUF;
} else if (macro->type == MACRO_FUNCTION) {
TODO();
goto READ_BUF;
} else if (scc_probe_stream_next(stream->input) == '/') {
ch = scc_probe_stream_peek(stream->input);
if (ch == '/') {
scc_probe_stream_reset(stream->input);
scc_lex_parse_skip_line(stream->input, &stream->pos);
scc_probe_stream_sync(stream->input);
} else if (ch == '*') {
scc_probe_stream_reset(stream->input);
scc_lex_parse_skip_block_comment(stream->input, &stream->pos);
scc_probe_stream_sync(stream->input);
}
UNREACHABLE();
}
return core_stream_next_char(stream->input);
// Not an identifier character: consume and return it directly
return scc_probe_stream_consume(stream->input);
}
static core_stream_t *pp_stream_init(smcc_pp_t *self, core_stream_t *input) {
pp_stream_t *stream = smcc_malloc(sizeof(pp_stream_t));
static void pp_stream_drop(scc_probe_stream_t *_stream) {
scc_pp_stream_t *stream = (scc_pp_stream_t *)_stream;
Assert(stream != null);
scc_cstring_free(&stream->stream.name);
if (stream->tmp_stream) {
scc_probe_stream_drop(stream->tmp_stream);
}
scc_free(_stream);
}
static scc_probe_stream_t *pp_stream_init(scc_pproc_t *self,
scc_probe_stream_t *input) {
scc_pp_stream_t *stream = scc_malloc(sizeof(scc_pp_stream_t));
if (stream == null) {
LOG_FATAL("Failed to allocate memory for output stream");
}
if (stream == null || self == null) {
return null;
}
if (self == null) {
scc_free(stream);
return null;
}
stream->self = self;
stream->input = input;
stream->size = 0;
stream->pos = 0;
stream->stream.name = cstring_from_cstr("pipe_stream");
stream->stream.free_stream = null;
stream->stream.next_char = next_char;
stream->stream.peek_char = null;
stream->stream.reset_char = null;
stream->tmp_stream = null;
stream->pos = scc_pos_create();
stream->stream.name = scc_cstring_from_cstr("pp_stream");
stream->stream.consume = pp_stream_read_char;
stream->stream.peek = null;
stream->stream.next = null;
stream->stream.sync = null;
stream->stream.reset = null;
stream->stream.back = null;
stream->stream.read_buf = null;
return (core_stream_t *)stream;
stream->stream.is_at_end = null;
stream->stream.drop = pp_stream_drop;
return (scc_probe_stream_t *)stream;
}
core_stream_t *pp_init(smcc_pp_t *pp, core_stream_t *input) {
scc_probe_stream_t *scc_pproc_init(scc_pproc_t *pp, scc_probe_stream_t *input) {
if (pp == null || input == null) {
return null;
}
core_mem_stream_t *stream = smcc_malloc(sizeof(core_mem_stream_t));
if (stream == null) {
LOG_FATAL("Failed to allocate memory for output stream");
}
pp->stream = pp_stream_init(pp, input);
Assert(pp->stream != null);
if (pp->stream == null) {
return null;
}
scc_pp_marco_table_init(&pp->macro_table);
hashmap_init(&pp->macros);
pp->macros.hash_func = (u32 (*)(const void *))hash_func;
pp->macros.key_cmp = (int (*)(const void *, const void *))hash_cmp;
return pp->stream;
}
// Destroy the preprocessor
void pp_drop(smcc_pp_t *pp) {
if (pp == NULL)
void scc_pproc_drop(scc_pproc_t *pp) {
if (pp == null)
return;
// Free all macro definitions
// Note: this needs hashmap iteration and cleanup functions
hashmap_drop(&pp->macros);
scc_pp_macro_table_drop(&pp->macro_table);
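// Free the string pool
// strpool_destroy(&pp->strpool);
// Free the conditional-compilation stack
// Each stack element's resources (if any) must be released
// vec_free(pp->if_stack);
// Free the file name
cstring_free(&pp->stream->name);
// Clean up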
if (pp->stream) {
scc_probe_stream_drop(pp->stream);
pp->stream = null;
}
}
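Note on the removed `keyword_cmp` helper earlier in this file: it binary-searches a keyword table using an input that is given as pointer plus length, so a match must also confirm that the table entry ends at `len`. The following is a minimal standalone sketch of that idea, not the project's code. The table `kw_table` and the function `keyword_index` are hypothetical names, and the sketch assumes the table is kept in `strcmp` order and that the input has no embedded NUL within `len` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directive table; it must stay sorted in strcmp order
 * for the binary search to be valid. */
static const char *const kw_table[] = {
    "define", "elif", "else", "endif", "if",
    "ifdef",  "ifndef", "include", "undef",
};

/* Returns the index of the keyword spelled by name[0..len), or -1. */
static int keyword_index(const char *name, size_t len) {
    int low = 0;
    int high = (int)(sizeof(kw_table) / sizeof(kw_table[0])) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        const char *key = kw_table[mid];
        /* Compare at most len bytes; a shorter key differs at its '\0'. */
        int cmp = strncmp(name, key, len);
        if (cmp == 0) {
            if (key[len] == '\0')
                return mid; /* same bytes and same length */
            cmp = -1;       /* name is a proper prefix of key, so it sorts first */
        }
        if (cmp < 0)
            high = mid - 1;
        else
            low = mid + 1;
    }
    return -1; /* not a keyword */
}
```

With this shape, `keyword_index("ifd", 3)` is rejected even though `"ifd"` is a prefix of `"ifdef"`, which is the case the length check exists for.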


@@ -4,7 +4,7 @@
#include <stdlib.h>
#include <utest/acutest.h>
static core_stream_t *from_file_stream(FILE *fp) {
static scc_probe_stream_t *from_file_stream(FILE *fp) {
if (fseek(fp, 0, SEEK_END) != 0) {
perror("fseek failed");
return NULL;
@@ -20,9 +20,9 @@ static core_stream_t *from_file_stream(FILE *fp) {
usize read_ret = fread(buffer, 1, fsize, fp);
fclose(fp);
core_mem_stream_t *mem_stream = malloc(sizeof(core_mem_stream_t));
core_stream_t *stream =
core_mem_stream_init(mem_stream, buffer, fsize, true);
scc_mem_probe_stream_t *mem_stream = malloc(sizeof(scc_mem_probe_stream_t));
scc_probe_stream_t *stream =
scc_mem_probe_stream_init(mem_stream, buffer, read_ret, true);
return stream;
}
@@ -37,28 +37,52 @@ static void test_file(const char *name) {
FILE *fexpect = fopen(expected_fname, "r");
assert(fexpect != NULL);
smcc_pp_t pp;
core_mem_stream_t stream;
core_stream_t *output_stream = pp_init(&pp, from_file_stream(fsrc));
core_stream_t *expect_stream = from_file_stream(fexpect);
while (1) {
int output_ch = core_stream_next_char(output_stream);
int expect_ch = core_stream_next_char(expect_stream);
TEST_CHECK(output_ch == expect_ch);
TEST_MSG("output: %c, expect: %c", output_ch, expect_ch);
if (output_ch == core_stream_eof) {
scc_pproc_t pp;
scc_probe_stream_t *output_stream =
scc_pproc_init(&pp, from_file_stream(fsrc));
scc_probe_stream_t *expect_stream = from_file_stream(fexpect);
fclose(fsrc);
fclose(fexpect);
TEST_CASE(src_fname);
#define BUFFER_LEN 4096
int ch;
char expect_buffer[BUFFER_LEN];
char output_buffer[BUFFER_LEN];
usize size_produced = 0, size_expected = 0;
for (usize i = 0; i < BUFFER_LEN; ++i) {
ch = scc_probe_stream_consume(expect_stream);
if (ch != scc_stream_eof) {
expect_buffer[i] = (char)ch;
} else {
size_expected = i;
break;
}
}
pp_drop(&pp);
for (usize i = 0; i < BUFFER_LEN; ++i) {
ch = scc_probe_stream_consume(output_stream);
if (ch != scc_stream_eof) {
output_buffer[i] = (char)ch;
} else {
size_produced = i;
break;
}
}
TEST_CHECK(size_produced == size_expected &&
memcmp(output_buffer, expect_buffer, size_produced) == 0);
TEST_DUMP("Expected:", expect_buffer, size_expected);
TEST_DUMP("Produced:", output_buffer, size_produced);
scc_pproc_drop(&pp);
}
static void test_basic(void) {
char name[32];
// for (int i = 1; i <= 22; ++i) {
// snprintf(name, sizeof(name), "%02d", i);
// test_file(name);
// }
for (int i = 1; i <= 22; ++i) {
snprintf(name, sizeof(name), "%02d", i);
test_file(name);
}
}
TEST_LIST = {


@@ -2,19 +2,19 @@
#include <stdio.h>
int main(void) {
smcc_pp_t pp;
core_mem_stream_t input;
core_stream_t *output;
scc_pproc_t pp;
scc_mem_probe_stream_t input;
scc_probe_stream_t *output;
const char buf[] = "#define A 123 \"asd\"\nA A A\n";
output =
pp_init(&pp, core_mem_stream_init(&input, buf, sizeof(buf) - 1, false));
output = scc_pproc_init(
&pp, scc_mem_probe_stream_init(&input, buf, sizeof(buf) - 1, false));
int ch = 0;
while (1) {
ch = core_stream_next_char(output);
if (ch == core_stream_eof) {
ch = scc_probe_stream_consume(output);
if (ch == scc_stream_eof) {
break;
}
putc(ch, stdout);


@@ -3,45 +3,50 @@
#include <string.h>
#include <utest/acutest.h>
static cbool process_input(const char *input, cstring_t *output) {
smcc_pp_t pp;
core_mem_stream_t mem_stream;
core_stream_t *output_stream;
static cbool process_input(const char *input, scc_cstring_t *output) {
scc_pproc_t pp;
scc_mem_probe_stream_t mem_stream;
scc_probe_stream_t *output_stream;
// Initialize the preprocessor
output_stream = pp_init(
&pp, core_mem_stream_init(&mem_stream, input, strlen(input), false));
output_stream =
scc_pproc_init(&pp, scc_mem_probe_stream_init(&mem_stream, input,
strlen(input), false));
// Collect the output
int ch;
*output = cstring_new();
*output = scc_cstring_create();
while (1) {
ch = core_stream_next_char(output_stream);
if (ch == core_stream_eof) {
ch = scc_probe_stream_consume(output_stream);
if (ch == scc_stream_eof) {
break;
}
cstring_push(output, (char)ch);
scc_cstring_append_ch(output, (char)ch);
}
// Release resources
pp_drop(&pp);
scc_pproc_drop(&pp);
return true;
}
#define CHECK_PP_OUTPUT_EXACT(input, expect) \
do { \
cstring_t output; \
scc_cstring_t output; \
process_input(input, &output); \
assert(output.data != NULL); \
TEST_CHECK(strcmp(output.data, expect) == 0); \
TEST_MSG("Expected: %s", expect); \
TEST_MSG("Produced: %s", output.data); \
} while (0)
#define CHECK_PP_OUTPUT_CONTAIN(input, expect) \
do { \
cstring_t output; \
scc_cstring_t output; \
process_input(input, &output); \
assert(output.data != NULL); \
TEST_CHECK(strstr(output.data, expect) != NULL); \
TEST_MSG("Expected: %s", expect); \
TEST_MSG("Produced: %s", output.data); \
} while (0)
static void test_define_simple_object_macro(void) {
@@ -56,6 +61,15 @@ static void test_define_complex_object_macro(void) {
CHECK_PP_OUTPUT_EXACT("#define PI 3.14159\nPI\n", "3.14159\n");
}
static void test_define_object_macro_backspace(void) {
TEST_CASE("object-like macro check backspace");
CHECK_PP_OUTPUT_EXACT("#define MAX 100\nMAX\n", "100\n");
CHECK_PP_OUTPUT_EXACT("#define NAME \ttest\r\nNAME\n", "test\n");
CHECK_PP_OUTPUT_EXACT("#define \tVALUE (100 \t+ 50)\nVALUE\n",
"(100 + 50)\n");
CHECK_PP_OUTPUT_EXACT("#define \tPI \t 3.14159\nPI\n", "3.14159\n");
}
static void test_define_function_macro(void) {
TEST_CASE("function-like macro");
CHECK_PP_OUTPUT_EXACT("#define ADD(a,b) a + b\nADD(1, 2)\n", "1 + 2\n");
@@ -76,7 +90,7 @@ static void test_define_concat_operator(void) {
TEST_CASE("concatenation operator (##)");
CHECK_PP_OUTPUT_EXACT("#define CONCAT(a,b) a##b\nCONCAT(hello,world)\n",
"helloworld\n");
CHECK_PP_OUTPUT_EXACT("#define JOIN(pre,suf) pre##suf\nJOIN(var,123)\n",
CHECK_PP_OUTPUT_EXACT("#define JOIN(pre,suf) pre ## suf\nJOIN(var, 123)\n",
"var123\n");
}
@@ -90,12 +104,112 @@ static void test_define_nested_macros(void) {
"((1 + 1) + 1)\n");
}
static void test_undef_macros(void) {
TEST_CASE("test_undef_macros");
CHECK_PP_OUTPUT_EXACT("#define x 1\n"
"x\n"
"#undef x\n"
"x\n"
"#define x 2\n"
"x\n",
"1\nx\n2\n");
}
static void hard_test_define_func_macros(void) {
TEST_CASE("func_macros_hard with pp_01");
CHECK_PP_OUTPUT_EXACT("#define hash_hash # ## #\n"
"#define mkstr(a) # a\n"
"#define in_between(a) mkstr(a)\n"
"#define join(c, d) in_between(c hash_hash d)\n"
"char p[] = join(x, y);\n",
"char p[] = \"x ## y\";\n");
TEST_CASE("func_macros_hard with recursive define");
CHECK_PP_OUTPUT_EXACT("#define M1(x) M2(x + 1)\n"
"#define M2(x) M1(x * 2)\n"
"M1(5)\n",
"M1(5 + 1 * 2)\n");
CHECK_PP_OUTPUT_EXACT("#define A B\n"
"#define B C\n"
"#define C 1\n"
"A\n",
"1\n");
TEST_CASE("func_macros_hard with self recursive call");
CHECK_PP_OUTPUT_EXACT("#define M(x) x\n"
"M(M(10))\n",
"10\n");
CHECK_PP_OUTPUT_EXACT("#define M(x) M(x)\n"
"#define N(x) x\n"
"N(M(1))\n",
"M(1)\n");
TEST_CASE("func_macros_hard with define by macro");
CHECK_PP_OUTPUT_EXACT("#define M1(x) M1(x + 1)\n"
"#define M2 M1\n"
"#define M3(x) x\n"
"M3(M3(M2)(0))\n",
"M1(0 + 1)\n");
TEST_CASE("TODO");
CHECK_PP_OUTPUT_EXACT("#define str(x) # x\n"
"str()\n",
"\"\"\n");
TEST_CASE("TODO");
CHECK_PP_OUTPUT_EXACT("#define x 1\n"
"#define f(a) f(x * (a))\n"
"f(0)\n"
"f(x)",
"f(1 * (0))\n"
"f(1 * (1))");
CHECK_PP_OUTPUT_EXACT("#define x x(0)\n"
"#define f(a) f(x * (a))\n"
"f(f(0))\n"
"f(f(x))\n"
"f(f(a))\n",
"f(x(0) * (f(x(0) * (0))))\n"
"f(x(0) * (f(x(0) * (x(0)))))\n"
"f(x(0) * (f(x(0) * (a))))\n");
}
static void test_conditional_compilation(void) {
TEST_CASE("conditional compilation");
CHECK_PP_OUTPUT_EXACT("#if 1\ntrue\n#endif\n", "true\n");
CHECK_PP_OUTPUT_EXACT("#if 0\nfalse\n#endif\n", "");
CHECK_PP_OUTPUT_EXACT("#define FLAG 1\n#if FLAG\ntrue\n#endif\n", "true\n");
}
static void test_error_cases(void) {
TEST_CASE("macro redefinition");
// A warning or an error should be detected
// CHECK_PP_OUTPUT_CONTAIN("#define A 1\n#define A 2\n", "warning");
TEST_CASE("undefined macro");
CHECK_PP_OUTPUT_EXACT("UNDEFINED_MACRO\n", "UNDEFINED_MACRO\n");
}
static void test_edge_cases(void) {
TEST_CASE("empty macro");
CHECK_PP_OUTPUT_EXACT("#define EMPTY\nEMPTY\n", "\n");
TEST_CASE("macro with only spaces");
CHECK_PP_OUTPUT_EXACT("#define SPACE \nSPACE\n", "\n");
TEST_CASE("deep nesting");
CHECK_PP_OUTPUT_EXACT("#define A B\n#define B C\n#define C 1\nA\n", "1\n");
}
#define TEST_LIST_CASE(func_name) {#func_name, func_name}
TEST_LIST = {
{"test_define_simple_object_macro", test_define_simple_object_macro},
{"test_define_complex_object_macro", test_define_complex_object_macro},
{"test_define_function_macro", test_define_function_macro},
{"test_define_stringify_operator", test_define_stringify_operator},
{"test_define_concat_operator", test_define_concat_operator},
{"test_define_nested_macros", test_define_nested_macros},
TEST_LIST_CASE(test_define_simple_object_macro),
TEST_LIST_CASE(test_define_complex_object_macro),
TEST_LIST_CASE(test_define_object_macro_backspace),
TEST_LIST_CASE(test_define_function_macro),
TEST_LIST_CASE(test_define_stringify_operator),
TEST_LIST_CASE(test_define_concat_operator),
TEST_LIST_CASE(test_define_nested_macros),
TEST_LIST_CASE(test_undef_macros),
TEST_LIST_CASE(hard_test_define_func_macros),
{NULL, NULL},
};
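Note on `TEST_LIST_CASE` above: the macro uses the `#` stringizing operator so that each table entry's name is generated from the function identifier and cannot drift from it. The sketch below shows the same pattern outside the test framework. The type `test_entry_t`, the functions `test_alpha` and `test_beta`, and the runner `run_all` are hypothetical stand-ins, not part of acutest or this project.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the test framework's entry type. */
typedef struct {
    const char *name;
    void (*func)(void);
} test_entry_t;

/* #func_name stringizes the argument, so name and pointer stay in sync. */
#define TEST_LIST_CASE(func_name) {#func_name, func_name}

static int calls = 0;
static void test_alpha(void) { calls += 1; }
static void test_beta(void) { calls += 10; }

static const test_entry_t test_list[] = {
    TEST_LIST_CASE(test_alpha),
    TEST_LIST_CASE(test_beta),
    {NULL, NULL}, /* sentinel entry terminates the list */
};

/* Runs every entry up to the sentinel and returns how many ran. */
static int run_all(void) {
    int n = 0;
    for (const test_entry_t *t = test_list; t->func != NULL; ++t) {
        t->func();
        ++n;
    }
    return n;
}
```

Renaming a test function then only requires touching one token per entry, which is presumably why the diff replaces the hand-written `{"name", name}` pairs.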


@@ -1,14 +0,0 @@
#ifndef __SCC_CORE_H__
#define __SCC_CORE_H__
#include <core_log.h>
#include <core_impl.h>
#include <core_macro.h>
#include <core_mem.h>
#include <core_pos.h>
#include <core_str.h>
#include <core_stream.h>
#include <core_vec.h>
#endif // __SCC_CORE_H__


@@ -1,7 +0,0 @@
[package]
name = "libutils"
version = "0.1.0"
dependencies = [
{ name = "core", path = "../libcore" }
]


@@ -1,9 +0,0 @@
#ifndef __SMCC_UTILS_H__
#define __SMCC_UTILS_H__
#include "hashmap.h"
#include "kllist.h"
#include "strpool.h"
#include <libcore.h>
#endif // __SMCC_UTILS_H__


@@ -1,27 +0,0 @@
#include "strpool.h"
void init_strpool(strpool_t *pool) {
pool->ht.hash_func = (u32 (*)(const void *))scc_strhash32;
pool->ht.key_cmp = (int (*)(const void *, const void *))scc_strcmp;
hashmap_init(&pool->ht);
}
const char *strpool_intern(strpool_t *pool, const char *str) {
void *existing = hashmap_get(&pool->ht, str);
if (existing) {
return existing;
}
usize len = scc_strlen(str) + 1;
char *new_str = scc_malloc(len);
if (!new_str) {
LOG_ERROR("strpool: Failed to allocate memory for string");
return NULL;
}
scc_memcpy(new_str, str, len);
hashmap_set(&pool->ht, new_str, new_str);
return new_str;
}
void strpool_destroy(strpool_t *pool) { hashmap_drop(&pool->ht); }
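Note on the removed `strpool_intern` above: interning returns one canonical copy per distinct string, so equal strings can later be compared by pointer. The sketch below illustrates that contract with a small fixed array and a linear scan instead of the project's hashmap. `mini_pool_t`, `mini_pool_intern`, `mini_pool_destroy` and `POOL_MAX` are hypothetical names. Unlike the removed `strpool_destroy`, this sketch frees the copies explicitly; whether `hashmap_drop` releases its keys is not visible in this diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_MAX 64 /* hypothetical fixed capacity for this sketch */

typedef struct {
    char *items[POOL_MAX];
    size_t count;
} mini_pool_t;

/* Returns the pooled copy of str, adding one on first sight; NULL on failure. */
static const char *mini_pool_intern(mini_pool_t *pool, const char *str) {
    for (size_t i = 0; i < pool->count; ++i) {
        if (strcmp(pool->items[i], str) == 0)
            return pool->items[i]; /* already interned: reuse the pointer */
    }
    if (pool->count == POOL_MAX)
        return NULL;
    size_t len = strlen(str) + 1; /* include the terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, str, len);
    pool->items[pool->count++] = copy;
    return copy;
}

/* Frees every pooled copy; the pool owns its strings. */
static void mini_pool_destroy(mini_pool_t *pool) {
    for (size_t i = 0; i < pool->count; ++i)
        free(pool->items[i]);
    pool->count = 0;
}
```

A hash table, as in the original, replaces the linear scan with an expected constant-time lookup; the ownership question stays the same.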


@@ -1,112 +0,0 @@
# vector_gdb.py
import gdb # type: ignore
from gdb.printing import PrettyPrinter # type: ignore
class VectorPrinter:
"""Final approach, compatible with both the old and new registration styles"""
def __init__(self, val: gdb.Value):
self.val:gdb.Value = val
def check_type(self) -> bool:
"""Type check (also handles anonymous structs)"""
try:
if self.val.type.code != gdb.TYPE_CODE_STRUCT:
return False
fields = self.val.type.fields()
if not fields:
return False
exp = ['size', 'cap', 'data']
for t in fields:
if t.name in exp:
exp.remove(t.name)
else:
return False
return True
except gdb.error:
return False
def to_string(self):
if not self.check_type():
return "Not a vector"
return "vector({} size={}, cap={})".format(
self.val.address,
self.val['size'],
self.val['cap'],
)
def display_hint(self):
return 'array'
def children(self):
"""Yield the array elements (the key improvement)"""
if not self.check_type():
return []
size = int(self.val['size'])
cap = int(self.val['cap'])
data_ptr = self.val['data']
if cap == 0 or data_ptr == 0:
return []
# Use GDB's built-in array cast
array = data_ptr.dereference()
array = array.cast(data_ptr.type.target().array(cap - 1))
for i in range(size):
# state = "<used>" if i < size else "<unused>"
try:
value = array[i]
yield (f"[{i}] {value.type} {value.address}", value)
except gdb.MemoryError:
yield (f"[{i}]", "<invalid>")
# Registration style 1: the traditional append method (the approach that worked before)
def append_printer():
gdb.pretty_printers.append(
lambda val: VectorPrinter(val) if VectorPrinter(val).check_type() else None
)
# Registration style 2: the newer registration method (fallback)
def register_new_printer():
class VectorPrinterLocator(PrettyPrinter):
def __init__(self):
super().__init__("vector_printer")
def __call__(self, val):
ret = VectorPrinter(val).check_type()
print(f"ret {ret}, type {val.type}, {[(i.name, i.type) for i in val.type.fields()]}")
return None
gdb.printing.register_pretty_printer(
gdb.current_objfile(),
VectorPrinterLocator()
)
# Register both ways for compatibility
append_printer() # keep the approach that originally worked
# register_new_printer() # add the newer registration
class VectorInfoCommand(gdb.Command):
"""Keep the original command unchanged"""
def __init__(self):
super().__init__("vector_info", gdb.COMMAND_USER)
def invoke(self, argument, from_tty):
val = gdb.parse_and_eval(argument)
printer = VectorPrinter(val)
if not printer.check_type():
print("Invalid vector")
return
print("=== Vector Details ===")
print("Size:", val['size'])
print("Capacity:", val['cap'])
print("Elements:")
for name, value in printer.children():
print(f" {name}: {value}")
VectorInfoCommand()


@@ -79,12 +79,6 @@ void init_logger(logger_t *logger, const char *name) {
log_set_level(logger, LOG_LEVEL_ALL);
}
logger_t *log_get(const char *name) {
// TODO for -Wunused-parameter
(void)name;
return &__default_logger_root;
}
void log_set_level(logger_t *logger, int level) {
if (logger)
logger->level = level;
@@ -98,9 +92,3 @@ void log_set_handler(logger_t *logger, log_handler handler) {
else
__default_logger_root.handler = handler;
}
void logger_destroy(logger_t *logger) {
// TODO for -Wunused-parameter
(void)logger;
return;
}


@@ -96,16 +96,6 @@ extern logger_t __default_logger_root;
*/
void init_logger(logger_t *logger, const char *name);
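// TODO log_set(); logger registration is not implemented yet
/**
* @brief Get or create a logger instance
* @param[in] name Logger name; NULL selects the default logger
* @return Pointer to the logger instance
* @warning Returns the root logger if no matching logger is found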
*/
logger_t *log_get(const char *name);
/**
* @brief Set the log level
* @param[in] logger Target logger instance
@@ -120,11 +110,6 @@ void log_set_level(logger_t *logger, int level);
*/
void log_set_handler(logger_t *logger, log_handler handler);
/**
* @todo TODO implement
*/
void logger_destroy(logger_t *logger);
#ifndef LOG_MAX_MAROC_BUF_SIZE
#define LOG_MAX_MAROC_BUF_SIZE LOGGER_MAX_BUF_SIZE ///< Macro expansion buffer size
#endif

222
runtime/runtime_gdb.py Normal file

@@ -0,0 +1,222 @@
"https://sourceware.org/gdb/current/onlinedocs/gdb.html/Python-API.html#Python-API"
import gdb # type: ignore
class VectorPrinter(gdb.ValuePrinter):
"""Final approach, compatible with both the old and new registration styles"""
def __init__(self, val: gdb.Value):
self.val: gdb.Value = val
@staticmethod
def check_type(val: gdb.Value) -> bool:
"""Type check (also handles anonymous structs)"""
try:
if val.type.code not in (gdb.TYPE_CODE_STRUCT, gdb.TYPE_CODE_TYPEDEF):
return False
if val.type.name == "scc_cstring_t":
return False
fields = val.type.fields()
if not fields:
return False
exp = ["size", "cap", "data"]
for t in fields:
if t.name in exp:
exp.remove(t.name)
else:
return False
return True
except gdb.error:
return False
except ValueError:
return False
except TypeError:
return False
except Exception as e:
print(f"[DEBUG] Unknown exception type: {type(e).__name__}")
print(f"[DEBUG] Exception details: {e}")
print(
f"[DEBUG] val type: {val.type if hasattr(val, 'type') else 'no type attr'}"
)
return False
def to_string(self):
"""
GDB will call this method to display the string representation
of the value passed to the object's constructor.
This is a basic method, and is optional.
When printing from the CLI, if the to_string method exists,
then GDB will prepend its result to the values returned by children.
Exactly how this formatting is done is dependent on the display hint,
and may change as more hints are added. Also, depending on the print settings
(see Print Settings), the CLI may print just the result of to_string
in a stack trace, omitting the result of children.
If this method returns a string, it is printed verbatim.
Otherwise, if this method returns an instance of gdb.Value,
then GDB prints this value. This may result in a call to another pretty-printer.
If instead the method returns a Python value which is convertible to a gdb.Value,
then GDB performs the conversion and prints the resulting value. Again,
this may result in a call to another pretty-printer. Python scalars
(integers, floats, and booleans) and strings are convertible to gdb.Value;
other types are not.
Finally, if this method returns None then no further operations are performed
in this method and nothing is printed.
If the result is not one of these types, an exception is raised.
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Pretty-Printing-API.html#Pretty-Printing-API
"""
return (
f"vector({self.val.address} size={self.val['size']}, cap={self.val['cap']})"
)
def display_hint(self):
"""
The CLI may call this method and use its result to change the formatting of a value.
The result will also be supplied to an MI consumer as a 'displayhint'
attribute of the variable being printed.
This is a basic method, and is optional. If it does exist,
this method must return a string or the special value None.
Some display hints are predefined by GDB:
'array'
Indicate that the object being printed is “array-like”.
The CLI uses this to respect parameters such as set print elements and set print array.
'map'
Indicate that the object being printed is “map-like”,
and that the children of this value can be assumed to alternate between keys and values.
'string'
Indicate that the object being printed is “string-like”.
If the printer's to_string method returns a Python string of some kind,
then GDB will call its internal language-specific string-printing function
to format the string. For the CLI this means adding quotation marks, possibly
escaping some characters, respecting set print elements, and the like.
The special value None causes GDB to apply the default display rules.
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Pretty-Printing-API.html#Pretty-Printing-API
"""
return "array"
def num_children(self):
"""
This is not a basic method, so GDB will only ever call it for objects
derived from gdb.ValuePrinter.
If available, this method should return the number of children.
None may be returned if the number can't readily be computed.
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Pretty-Printing-API.html#Pretty-Printing-API
"""
return int(self.val["size"])
def children(self):
"""
This is not a basic method, so GDB will only ever call it for objects
derived from gdb.ValuePrinter.
If available, this method should return the child item (that is,
a tuple holding the name and value of this child) indicated by n.
Indices start at zero.
GDB provides a function which can be used to look up the default
pretty-printer for a gdb.Value:
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Pretty-Printing-API.html#Pretty-Printing-API
"""
size = int(self.val["size"])
cap = int(self.val["cap"])
data_ptr = self.val["data"]
if cap == 0 or data_ptr == 0:
return []
# Use GDB's built-in array cast
array = data_ptr.dereference()
array = array.cast(data_ptr.type.target().array(cap - 1))
for i in range(size):
# state = "<used>" if i < size else "<unused>"
try:
value = array[i]
yield (f"[{i}] {value.type} {value.address}", value)
except gdb.MemoryError:
yield (f"[{i}]", "<invalid>")
class HashTablePrinter(gdb.ValuePrinter):
def __init__(self, val: gdb.Value):
self.val: gdb.Value = val
@staticmethod
def check_type(val: gdb.Value) -> bool:
if val.type.name in ["scc_hashtable_t", "scc_hashtable"]:
return True
return False
def append_printer():
"Registration style 1: the traditional append method (the approach that worked before)"
gdb.pretty_printers.append(
lambda val: VectorPrinter(val) if VectorPrinter.check_type(val) else None
)
def register_new_printer():
"Registration style 2: the newer registration method (fallback)"
def str_lookup_function(val):
if VectorPrinter.check_type(val) is False:
return None
ret = VectorPrinter(val)
# print(
# f"ret {ret}, type {val.type.name}, {[(i.name, i.type) for i in val.type.fields()]}"
# )
return ret
gdb.printing.register_pretty_printer(gdb.current_objfile(), str_lookup_function)
# if gdb.current_progspace() is not None:
# pts = gdb.current_progspace().pretty_printers
# print(pts, len(pts))
# pts.append(str_lookup_function)
class VectorInfoCommand(gdb.Command):
"""Keep the original command unchanged"""
def __init__(self):
super().__init__("vector_info", gdb.COMMAND_USER)
def invoke(self, argument, from_tty):
val = gdb.parse_and_eval(argument)
if not VectorPrinter.check_type(val):
print("Invalid vector")
return
printer = VectorPrinter(val)
print("=== Vector Details ===")
print("Size:", val["size"])
print("Capacity:", val["cap"])
print("Elements:")
for name, value in printer.children():
print(f" {name}: {value}")
if __name__ == "__main__":
# Register both ways for compatibility
# append_printer() # keep the approach that originally worked
register_new_printer() # add the newer registration
VectorInfoCommand()


@@ -1,5 +1,5 @@
[package]
name = "libcore"
name = "scc_core"
version = "0.1.0"
default_features = ["std_impl"]


@@ -0,0 +1,14 @@
#ifndef __SCC_CORE_H__
#define __SCC_CORE_H__
#include <scc_core_log.h>
#include <scc_core_impl.h>
#include <scc_core_macro.h>
#include <scc_core_mem.h>
#include <scc_core_pos.h>
#include <scc_core_str.h>
#include <scc_core_stream.h>
#include <scc_core_vec.h>
#endif // __SCC_CORE_H__


@@ -1,7 +1,7 @@
#ifndef __SCC_CORE_IMPL_H__
#define __SCC_CORE_IMPL_H__
#include "core_type.h"
#include "scc_core_type.h"
/* ====== Core memory management interface ====== */


@@ -1,7 +1,7 @@
#ifndef __SCC_CORE_MEM_H__
#define __SCC_CORE_MEM_H__
#include "core_type.h"
#include "scc_core_type.h"
void *scc_memcpy(void *dest, const void *src, usize n);
void *scc_memmove(void *dest, const void *src, usize n);


@@ -1,8 +1,8 @@
#ifndef __SCC_CORE_POS_H__
#define __SCC_CORE_POS_H__
#include "core_str.h"
#include "core_type.h"
#include "scc_core_str.h"
#include "scc_core_type.h"
typedef struct scc_pos {
scc_cstring_t name;
usize line;
@@ -10,16 +10,16 @@ typedef struct scc_pos {
usize offset;
} scc_pos_t;
static inline scc_pos_t scc_pos_init() {
return (scc_pos_t){scc_cstring_new(), 1, 1, 0};
static inline scc_pos_t scc_pos_create() {
return (scc_pos_t){scc_cstring_create(), 1, 1, 0};
}
static inline void core_pos_next(scc_pos_t *pos) {
static inline void scc_pos_next(scc_pos_t *pos) {
pos->offset++;
pos->col++;
}
static inline void core_pos_next_line(scc_pos_t *pos) {
static inline void scc_pos_next_line(scc_pos_t *pos) {
pos->offset++;
pos->line++;
pos->col = 1;


@@ -1,9 +1,10 @@
#ifndef __SCC_CORE_STR_H__
#define __SCC_CORE_STR_H__
#include "core_impl.h"
#include "core_log.h"
#include "core_type.h"
#include "scc_core_impl.h"
#include "scc_core_log.h"
#include "scc_core_mem.h"
#include "scc_core_type.h"
/**
* @brief
@@ -20,10 +21,17 @@ typedef struct scc_cstring {
*
* @return cstring_t
*/
static inline scc_cstring_t scc_cstring_new(void) {
static inline scc_cstring_t scc_cstring_create(void) {
return (scc_cstring_t){.data = null, .size = 0, .cap = 0};
}
static inline void scc_cstring_init(scc_cstring_t *string) {
Assert(string != null);
string->data = null;
string->size = 0;
string->cap = 0;
}
/**
* @brief C
*
@@ -32,7 +40,7 @@ static inline scc_cstring_t scc_cstring_new(void) {
*/
static inline scc_cstring_t scc_cstring_from_cstr(const char *s) {
if (s == null) {
return scc_cstring_new();
return scc_cstring_create();
}
usize len = 0;
@@ -48,6 +56,10 @@ static inline scc_cstring_t scc_cstring_from_cstr(const char *s) {
return (scc_cstring_t){.size = len + 1, .cap = len + 1, .data = data};
}
static inline scc_cstring_t scc_cstring_copy(const scc_cstring_t *s) {
return scc_cstring_from_cstr(s->data);
}
/**
* @brief
*
@@ -57,7 +69,7 @@ static inline void scc_cstring_free(scc_cstring_t *str) {
if (str == null) {
return;
}
if (str->cap != 0 && str->data != null) {
if (str->data != null) {
scc_free(str->data);
str->data = null;
}
@@ -136,7 +148,13 @@ static inline void scc_cstring_append_ch(scc_cstring_t *str, char ch) {
* @return usize
*/
static inline usize scc_cstring_len(const scc_cstring_t *str) {
return str ? str->size - 1 : 0;
if (str == null) {
return 0;
}
if (str->size == 0) {
return 0;
}
return str->size - 1;
}
/**
@@ -171,9 +189,20 @@ static inline void scc_cstring_clear(scc_cstring_t *str) {
*/
static inline char *scc_cstring_as_cstr(const scc_cstring_t *str) {
if (str == null || str->data == null) {
return "";
return null;
}
return str->data;
}
static inline char *scc_cstring_move_cstr(scc_cstring_t *str) {
if (str == null || str->data == null) {
return null;
}
char *ret = str->data;
str->data = null;
str->cap = 0;
str->size = 0;
return ret;
}
#endif /* __SCC_CORE_STR_H__ */
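
Reviewer note: two changes in this header are easy to miss. `scc_cstring_len` now guards `size == 0`, because `size` counts the terminating NUL and subtracting 1 from an unsigned zero would wrap around. `scc_cstring_move_cstr` hands the buffer to the caller and leaves the string empty. The sketch below mirrors both behaviors with hypothetical `demo_str_*` names on top of the C standard library; it is an illustration, not the project's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for scc_cstring_t; size includes the NUL byte. */
typedef struct demo_str {
    char *data;
    size_t size;
    size_t cap;
} demo_str_t;

static demo_str_t demo_str_from(const char *s) {
    demo_str_t str = {NULL, 0, 0};
    if (s == NULL) {
        return str;
    }
    size_t len = strlen(s);
    char *data = malloc(len + 1);
    if (data == NULL) {
        return str; /* allocation failed: return the empty string */
    }
    memcpy(data, s, len + 1);
    str.data = data;
    str.size = len + 1;
    str.cap = len + 1;
    return str;
}

/* Length without the NUL; an empty string (size == 0) has length 0. */
static size_t demo_str_len(const demo_str_t *str) {
    if (str == NULL || str->size == 0) {
        return 0;
    }
    return str->size - 1;
}

/* Transfer ownership of the buffer to the caller and reset the string. */
static char *demo_str_move(demo_str_t *str) {
    if (str == NULL || str->data == NULL) {
        return NULL;
    }
    char *ret = str->data;
    str->data = NULL;
    str->cap = 0;
    str->size = 0;
    return ret;
}
```

After a move the caller is responsible for freeing the returned pointer, and the source string reports length 0 again.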

View File

@@ -1,15 +1,15 @@
#ifndef __SMCC_CORE_PROBE_STREAM_H__
#define __SMCC_CORE_PROBE_STREAM_H__
#include "core_impl.h"
#include "core_macro.h"
#include "core_mem.h"
#include "core_str.h"
#include "scc_core_impl.h"
#include "scc_core_macro.h"
#include "scc_core_mem.h"
#include "scc_core_str.h"
struct scc_probe_stream;
typedef struct scc_probe_stream scc_probe_stream_t;
#define core_stream_eof (-1)
#define scc_stream_eof (-1)
/**
* @brief
@@ -100,21 +100,31 @@ typedef struct scc_mem_probe_stream {
usize data_length;
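usize curr_pos; // current read position
usize probe_pos; // probe position, used for peek
cbool owned; // whether the data is owned (must be freed)
cbool owned; // whether the data is owned (if so, it is freed automatically)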
} scc_mem_probe_stream_t;
/**
* @brief
* @brief (freeing of scc_mem_probe_stream_t)
*
* @param stream
* @param data
* @param length
* @param need_copy
* @param owned
* @return core_probe_stream_t* NULL
*/
scc_probe_stream_t *scc_mem_probe_stream_init(scc_mem_probe_stream_t *stream,
const char *data, usize length,
cbool need_copy);
const char *data, usize length,
cbool owned);
/**
* @brief (drop frees the memory automatically)
*
* @param data
* @param length
* @param owned
* @return scc_probe_stream_t*
*/
scc_probe_stream_t *scc_mem_probe_stream_alloc(const char *data, usize length,
cbool owned);
#endif
#endif /* __SMCC_CORE_PROBE_STREAM_H__ */

View File

@@ -8,11 +8,34 @@
#ifndef __SCC_CORE_VEC_H__
#define __SCC_CORE_VEC_H__
#include "core_impl.h"
#include "core_type.h"
#ifndef __SCC_CORE_VEC_USE_STD__
#include "scc_core_impl.h"
#include "scc_core_type.h"
#define __scc_vec_realloc scc_realloc
#define __scc_vec_free scc_free
#else
#include <stddef.h>
#include <stdlib.h>
typedef size_t usize;
#define __scc_vec_realloc realloc
#define __scc_vec_free free
#ifndef LOG_FATAL
#include <stdio.h>
#define LOG_FATAL(...) \
do { \
printf(__VA_ARGS__); \
exit(1); \
} while (0)
#endif
#ifndef Assert
#include <assert.h>
#define Assert(cond) assert(cond)
#endif
#endif
/** @defgroup vec_struct Data structure definitions */
@@ -51,6 +74,22 @@
(vec).size = 0, (vec).cap = 0, (vec).data = 0; \
} while (0)
#define scc_vec_realloc(vec, new_cap) \
do { \
void *data = \
__scc_vec_realloc((vec).data, new_cap * sizeof(*(vec).data)); \
if (!data) { \
LOG_FATAL("vector_push: realloc failed\n"); \
} \
(vec).cap = new_cap; \
(vec).data = data; \
} while (0)
#define scc_vec_size(vec) ((vec).size)
#define scc_vec_cap(vec) ((vec).cap)
#define scc_vec_foreach(vec, idx) \
for (usize idx = 0; idx < scc_vec_size(vec); ++idx)
/**
* @def scc_vec_push(vec, value)
* @brief
@@ -64,13 +103,7 @@
do { \
if ((vec).size >= (vec).cap) { \
int cap = (vec).cap ? (vec).cap * 2 : 4; \
void *data = \
__scc_vec_realloc((vec).data, cap * sizeof(*(vec).data)); \
if (!data) { \
LOG_FATAL("vector_push: realloc failed\n"); \
} \
(vec).cap = cap; \
(vec).data = data; \
scc_vec_realloc(vec, cap); \
} \
Assert((vec).data != null); \
(vec).data[(vec).size++] = value; \
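
Reviewer note: this hunk factors the growth step of `scc_vec_push` into `scc_vec_realloc` and adds a mode that falls back to the C standard library. The sketch below shows the same doubling strategy (start at 4, then double) with hypothetical `DEMO_VEC` / `demo_vec_*` macros built directly on `realloc`. Unlike the macros in the diff, it parenthesizes `new_cap` and `value`, which is a defensive choice of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical generic vector, modeled loosely on SCC_VEC. */
#define DEMO_VEC(type)                                                         \
    struct {                                                                   \
        size_t size;                                                           \
        size_t cap;                                                            \
        type *data;                                                            \
    }

#define demo_vec_init(vec)                                                     \
    do {                                                                       \
        (vec).size = 0, (vec).cap = 0, (vec).data = NULL;                      \
    } while (0)

/* Resize the backing buffer; abort on allocation failure. */
#define demo_vec_realloc(vec, new_cap)                                         \
    do {                                                                       \
        void *demo_data_ =                                                     \
            realloc((vec).data, (new_cap) * sizeof(*(vec).data));              \
        if (!demo_data_) {                                                     \
            fprintf(stderr, "demo_vec: realloc failed\n");                     \
            exit(1);                                                           \
        }                                                                      \
        (vec).cap = (new_cap);                                                 \
        (vec).data = demo_data_;                                               \
    } while (0)

/* Append one element, growing 0 -> 4 -> 8 -> 16 ... as needed. */
#define demo_vec_push(vec, value)                                              \
    do {                                                                       \
        if ((vec).size >= (vec).cap) {                                         \
            size_t demo_cap_ = (vec).cap ? (vec).cap * 2 : 4;                  \
            demo_vec_realloc(vec, demo_cap_);                                  \
        }                                                                      \
        assert((vec).data != NULL);                                            \
        (vec).data[(vec).size++] = (value);                                    \
    } while (0)

/* Push 0..n-1, report the final capacity, and return the sum. */
static int demo_sum_first(size_t n, size_t *cap_out) {
    DEMO_VEC(int) v;
    demo_vec_init(v);
    for (size_t i = 0; i < n; ++i) {
        demo_vec_push(v, (int)i);
    }
    int sum = 0;
    for (size_t i = 0; i < v.size; ++i) {
        sum += v.data[i];
    }
    *cap_out = v.cap;
    free(v.data);
    return sum;
}
```

Pushing five elements triggers two growth steps, so the capacity ends at 8.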

View File

@@ -2,7 +2,7 @@
#define _CRT_SECURE_NO_WARNINGS
#endif
#include <core_impl.h>
#include <scc_core_impl.h>
#define __SCC_LOG_IMPORT_SRC__
#define log_snprintf scc_snprintf
#define log_printf scc_printf

View File

@@ -1,4 +1,4 @@
#include <core_mem.h>
#include <scc_core_mem.h>
// Check whether unaligned access is supported (x86/x64 support it)
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || \

View File

@@ -1,5 +1,5 @@
#include <core_log.h>
#include <core_stream.h>
#include <scc_core_log.h>
#include <scc_core_stream.h>
#ifndef __SCC_CORE_NO_MEM_PROBE_STREAM__
@@ -8,7 +8,7 @@ static int mem_probe_stream_consume(scc_probe_stream_t *_stream) {
scc_mem_probe_stream_t *stream = (scc_mem_probe_stream_t *)_stream;
if (stream->curr_pos >= stream->data_length) {
return core_stream_eof;
return scc_stream_eof;
}
unsigned char ch = stream->data[stream->curr_pos++];
@@ -24,7 +24,7 @@ static int mem_probe_stream_peek(scc_probe_stream_t *_stream) {
scc_mem_probe_stream_t *stream = (scc_mem_probe_stream_t *)_stream;
if (stream->probe_pos >= stream->data_length) {
return core_stream_eof;
return scc_stream_eof;
}
// Peek only; do not advance the probe position
@@ -36,7 +36,7 @@ static int mem_probe_stream_next(scc_probe_stream_t *_stream) {
scc_mem_probe_stream_t *stream = (scc_mem_probe_stream_t *)_stream;
if (stream->probe_pos >= stream->data_length) {
return core_stream_eof;
return scc_stream_eof;
}
// Return the character at the probe position and advance the probe
@@ -111,7 +111,7 @@ static cbool mem_probe_stream_is_at_end(scc_probe_stream_t *_stream) {
return stream->curr_pos >= stream->data_length;
}
static void mem_probe_stream_destroy(scc_probe_stream_t *_stream) {
static void mem_probe_stream_drop(scc_probe_stream_t *_stream) {
Assert(_stream != null);
scc_mem_probe_stream_t *stream = (scc_mem_probe_stream_t *)_stream;
@@ -125,7 +125,7 @@ static void mem_probe_stream_destroy(scc_probe_stream_t *_stream) {
scc_probe_stream_t *scc_mem_probe_stream_init(scc_mem_probe_stream_t *stream,
const char *data, usize length,
cbool need_copy) {
cbool owned) {
if (stream == null || data == null) {
LOG_ERROR("param error");
return null;
@@ -133,22 +133,11 @@ scc_probe_stream_t *scc_mem_probe_stream_init(scc_mem_probe_stream_t *stream,
if (length == 0) {
LOG_WARN("input memory is empty");
need_copy = false;
owned = false;
}
stream->owned = need_copy;
if (need_copy) {
char *buf = (char *)scc_malloc(length);
if (buf == null) {
LOG_ERROR("malloc error");
return null;
}
scc_memcpy(buf, data, length);
stream->data = buf;
} else {
stream->data = data;
}
stream->owned = owned;
stream->data = data;
stream->data_length = length;
stream->curr_pos = 0;
stream->probe_pos = 0;
@@ -164,9 +153,30 @@ scc_probe_stream_t *scc_mem_probe_stream_init(scc_mem_probe_stream_t *stream,
stream->stream.read_buf = mem_probe_stream_read_buf;
stream->stream.reset = mem_probe_stream_reset;
stream->stream.is_at_end = mem_probe_stream_is_at_end;
stream->stream.drop = mem_probe_stream_destroy;
stream->stream.drop = mem_probe_stream_drop;
return (scc_probe_stream_t *)stream;
}
static void scc_owned_mem_stream_drop(scc_probe_stream_t *_stream) {
scc_mem_probe_stream_t *stream = (scc_mem_probe_stream_t *)_stream;
mem_probe_stream_drop(_stream);
scc_free(stream);
}
scc_probe_stream_t *scc_mem_probe_stream_alloc(const char *data, usize length,
cbool owned) {
scc_mem_probe_stream_t *stream =
(scc_mem_probe_stream_t *)scc_malloc(sizeof(scc_mem_probe_stream_t));
if (stream == null) {
return null;
}
scc_probe_stream_t *ret =
scc_mem_probe_stream_init(stream, data, length, owned);
stream->stream.drop = scc_owned_mem_stream_drop;
Assert(ret != null);
return ret;
}
#endif /* __SCC_CORE_NO_MEM_PROBE_STREAM__ */
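
Reviewer note: after this change `scc_mem_probe_stream_init` no longer copies the input; `owned` only records whether `drop` should free the buffer, and `scc_mem_probe_stream_alloc` swaps in a `drop` that also frees the heap-allocated stream object. The sketch below illustrates that two-level drop pattern with hypothetical `demo_stream_*` names. Freeing the stream when initialization fails is an addition of this sketch and is not something the diff does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_EOF (-1)

typedef struct demo_stream demo_stream_t;
struct demo_stream {
    const char *data;
    size_t length;
    size_t pos;
    bool owned;                        /* true: drop frees data */
    void (*drop)(demo_stream_t *self); /* chosen by init or alloc */
};

/* Release the buffer if owned; the stream object itself is untouched. */
static void demo_stream_drop(demo_stream_t *s) {
    assert(s != NULL);
    if (s->owned && s->data != NULL) {
        free((void *)s->data);
    }
    s->data = NULL;
    s->length = 0;
    s->pos = 0;
}

/* Drop for heap-allocated streams: release the buffer, then the object. */
static void demo_heap_stream_drop(demo_stream_t *s) {
    demo_stream_drop(s);
    free(s);
}

/* Initialize caller-provided storage; no copy of data is made. */
static demo_stream_t *demo_stream_init(demo_stream_t *s, const char *data,
                                       size_t length, bool owned) {
    if (s == NULL || data == NULL) {
        return NULL;
    }
    s->data = data;
    s->length = length;
    s->pos = 0;
    s->owned = (length == 0) ? false : owned;
    s->drop = demo_stream_drop;
    return s;
}

/* Allocate the stream on the heap; its drop also frees the object. */
static demo_stream_t *demo_stream_alloc(const char *data, size_t length,
                                        bool owned) {
    demo_stream_t *s = malloc(sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    if (demo_stream_init(s, data, length, owned) == NULL) {
        free(s); /* sketch-only cleanup on bad parameters */
        return NULL;
    }
    s->drop = demo_heap_stream_drop;
    return s;
}

/* Read one byte or DEMO_EOF at the end of the buffer. */
static int demo_stream_next(demo_stream_t *s) {
    if (s->pos >= s->length) {
        return DEMO_EOF;
    }
    return (unsigned char)s->data[s->pos++];
}
```

With this layout callers always finish with `s->drop(s)`, regardless of whether the stream lives on the stack or on the heap.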

View File

@@ -0,0 +1,5 @@
[package]
name = "scc_utils"
version = "0.1.0"
dependencies = [{ name = "core", path = "../scc_core" }]

View File

@@ -1,48 +1,48 @@
/**
* @file hashmap.h
* @file hashtable.h
* @brief
*
*
*/
#ifndef __SCC_HASHMAP_H__
#define __SCC_HASHMAP_H__
#ifndef __SCC_HASHTABLE_H__
#define __SCC_HASHTABLE_H__
#include <libcore.h>
#include <scc_core.h>
/**
* @enum hp_entry_state_t
* @brief
*/
typedef enum hashmap_entry_state {
typedef enum scc_hashtable_entry_state {
ENTRY_EMPTY, /**< empty slot (never used) */
ENTRY_ACTIVE, /**< live entry (holds a key/value pair) */
ENTRY_TOMBSTONE /**< tombstone (deleted entry) */
} hp_entry_state_t;
} scc_hashtable_entry_state_t;
/**
* @struct hashmap_entry_t
* @struct scc_hashtable_entry_t
* @brief
*
* @note key/value memory is managed by the caller
*/
typedef struct hashmap_entry {
const void *key; /**< key pointer (immutable) */
void *value; /**< value pointer */
u32 hash; /**< precomputed hash (avoids recomputation) */
hp_entry_state_t state; /**< current entry state */
} hashmap_entry_t;
typedef struct scc_hashtable_entry {
const void *key; /**< key pointer (immutable) */
void *value; /**< value pointer */
u32 hash; /**< precomputed hash (avoids recomputation) */
scc_hashtable_entry_state_t state; /**< current entry state */
} scc_hashtable_entry_t;
/**
* @struct hashmap_t
* @struct scc_hashtable_t
* @brief
*
* 使
*/
typedef struct smcc_hashmap {
SCC_VEC(hashmap_entry_t) entries; /**< entry storage */
u32 count; /**< number of live entries (excluding tombstones) */
u32 tombstone_count; /**< number of tombstone entries */
typedef struct scc_hashtable {
SCC_VEC(scc_hashtable_entry_t) entries; /**< entry storage */
u32 count; /**< number of live entries (excluding tombstones) */
u32 tombstone_count; /**< number of tombstone entries */
/**
* @brief
* @param key
@@ -56,7 +56,7 @@ typedef struct smcc_hashmap {
* @return 00
*/
int (*key_cmp)(const void *key1, const void *key2);
} hashmap_t;
} scc_hashtable_t;
/**
* @brief
@@ -64,7 +64,7 @@ typedef struct smcc_hashmap {
*
* @warning Usable only after hash_func and key_cmp are set
*/
void hashmap_init(hashmap_t *ht);
void scc_hashtable_init(scc_hashtable_t *ht);
/**
* @brief /
@@ -73,7 +73,7 @@ void hashmap_init(hashmap_t *ht);
* @param value
* @return NULL
*/
void *hashmap_set(hashmap_t *ht, const void *key, void *value);
void *scc_hashtable_set(scc_hashtable_t *ht, const void *key, void *value);
/**
* @brief
@@ -81,7 +81,7 @@ void *hashmap_set(hashmap_t *ht, const void *key, void *value);
* @param key
* @return NULL
*/
void *hashmap_get(hashmap_t *ht, const void *key);
void *scc_hashtable_get(scc_hashtable_t *ht, const void *key);
/**
* @brief
@@ -91,7 +91,7 @@ void *hashmap_get(hashmap_t *ht, const void *key);
*
* @note
*/
void *hashmap_del(hashmap_t *ht, const void *key);
void *scc_hashtable_del(scc_hashtable_t *ht, const void *key);
/**
* @brief
@@ -99,17 +99,18 @@ void *hashmap_del(hashmap_t *ht, const void *key);
*
* @note key/value memory
*/
void hashmap_drop(hashmap_t *ht);
void scc_hashtable_drop(scc_hashtable_t *ht);
/**
* @typedef hashmap_iter_fn
* @typedef scc_hashtable_iter_fn
* @brief
* @param key
* @param value
* @param context
* @return 0
*/
typedef int (*hashmap_iter_fn)(const void *key, void *value, void *context);
typedef int (*scc_hashtable_iter_fn)(const void *key, void *value,
void *context);
/**
* @brief
@@ -117,6 +118,7 @@ typedef int (*hashmap_iter_fn)(const void *key, void *value, void *context);
* @param iter_func
* @param context
*/
void hashmap_foreach(hashmap_t *ht, hashmap_iter_fn iter_func, void *context);
void scc_hashtable_foreach(scc_hashtable_t *ht, scc_hashtable_iter_fn iter_func,
void *context);
#endif /* __SCC_HASHMAP_H__ */
#endif /* __SCC_HASHTABLE_H__ */

View File

@@ -8,8 +8,8 @@
#ifndef __SCC_STRPOOL_H__
#define __SCC_STRPOOL_H__
#include "hashmap.h"
#include <libcore.h>
#include "scc_hashtable.h"
#include <scc_core.h>
/**
* @struct strpool_t
@@ -18,14 +18,14 @@
*
*/
typedef struct strpool {
hashmap_t ht; /**< hash table for fast lookup of stored strings */
} strpool_t;
scc_hashtable_t ht; /**< hash table for fast lookup of stored strings */
} scc_strpool_t;
/**
* @brief
* @param pool
*/
void init_strpool(strpool_t *pool);
void scc_strpool_init(scc_strpool_t *pool);
/**
* @brief
@@ -36,7 +36,7 @@ void init_strpool(strpool_t *pool);
* @note
* @note
*/
const char *strpool_intern(strpool_t *pool, const char *str);
const char *scc_strpool_intern(scc_strpool_t *pool, const char *str);
/**
* @brief
@@ -45,6 +45,25 @@ const char *strpool_intern(strpool_t *pool, const char *str);
* @warning
* @note
*/
void strpool_destroy(strpool_t *pool);
void scc_strpool_drop(scc_strpool_t *pool);
/**
* @typedef scc_strpool_iter_fn
* @brief
* @param key
* @param value
* @param context
* @return 0
*/
typedef int (*scc_strpool_iter_fn)(const char *key, char *value, void *context);
/**
* @brief
* @param pool
* @param iter_func
* @param context
*/
void scc_strpool_foreach(scc_strpool_t *pool, scc_strpool_iter_fn iter_func,
void *context);
#endif /* __SCC_STRPOOL_H__ */

View File

@@ -0,0 +1,9 @@
#ifndef __SMCC_UTILS_H__
#define __SMCC_UTILS_H__
#include "kllist.h"
#include "scc_hashtable.h"
#include "scc_strpool.h"
#include <scc_core.h>
#endif /* __SMCC_UTILS_H__ */

View File

@@ -1,10 +1,10 @@
#include <hashmap.h>
#include <scc_hashtable.h>
#ifndef SCC_INIT_HASHMAP_SIZE
#define SCC_INIT_HASHMAP_SIZE (32)
#endif
void hashmap_init(hashmap_t *ht) {
void scc_hashtable_init(scc_hashtable_t *ht) {
scc_vec_init(ht->entries);
ht->count = 0;
ht->tombstone_count = 0;
@@ -21,17 +21,18 @@ static int next_power_of_two(int n) {
return n + 1;
}
static hashmap_entry_t *find_entry(hashmap_t *ht, const void *key, u32 hash) {
static scc_hashtable_entry_t *find_entry(scc_hashtable_t *ht, const void *key,
u32 hash) {
if (ht->entries.cap == 0)
return NULL;
u32 index = hash & (ht->entries.cap - 1); // capacity is a power of two
u32 probe = 0;
hashmap_entry_t *tombstone = NULL;
scc_hashtable_entry_t *tombstone = NULL;
while (1) {
hashmap_entry_t *entry = &scc_vec_at(ht->entries, index);
scc_hashtable_entry_t *entry = &scc_vec_at(ht->entries, index);
if (entry->state == ENTRY_EMPTY) {
return tombstone ? tombstone : entry;
}
@@ -53,25 +54,27 @@ static hashmap_entry_t *find_entry(hashmap_t *ht, const void *key, u32 hash) {
return NULL;
}
static void adjust_capacity(hashmap_t *ht, int new_cap) {
static void adjust_capacity(scc_hashtable_t *ht, usize new_cap) {
new_cap = next_power_of_two(new_cap);
Assert(new_cap >= ht->entries.cap);
SCC_VEC(hashmap_entry_t) old_entries;
SCC_VEC(scc_hashtable_entry_t) old_entries;
old_entries.data = ht->entries.data;
old_entries.cap = ht->entries.cap;
// size is otherwise unused; it is set for the gdb Python extension when debugging
ht->entries.size = new_cap;
ht->entries.cap = new_cap;
ht->entries.data = scc_realloc(NULL, new_cap * sizeof(hashmap_entry_t));
scc_memset(ht->entries.data, 0, new_cap * sizeof(hashmap_entry_t));
ht->entries.data =
scc_realloc(NULL, new_cap * sizeof(scc_hashtable_entry_t));
scc_memset(ht->entries.data, 0, new_cap * sizeof(scc_hashtable_entry_t));
// rehash all of the old data
for (usize i = 0; i < old_entries.cap; i++) {
hashmap_entry_t *entry = &scc_vec_at(old_entries, i);
scc_hashtable_entry_t *entry = &scc_vec_at(old_entries, i);
if (entry->state == ENTRY_ACTIVE) {
hashmap_entry_t *dest = find_entry(ht, entry->key, entry->hash);
scc_hashtable_entry_t *dest =
find_entry(ht, entry->key, entry->hash);
*dest = *entry;
}
}
@@ -80,7 +83,7 @@ static void adjust_capacity(hashmap_t *ht, int new_cap) {
ht->tombstone_count = 0;
}
void *hashmap_set(hashmap_t *ht, const void *key, void *value) {
void *scc_hashtable_set(scc_hashtable_t *ht, const void *key, void *value) {
if (ht->count + ht->tombstone_count >= ht->entries.cap * 0.75) {
int new_cap = ht->entries.cap < SCC_INIT_HASHMAP_SIZE
? SCC_INIT_HASHMAP_SIZE
@@ -89,7 +92,7 @@ void *hashmap_set(hashmap_t *ht, const void *key, void *value) {
}
u32 hash = ht->hash_func(key);
hashmap_entry_t *entry = find_entry(ht, key, hash);
scc_hashtable_entry_t *entry = find_entry(ht, key, hash);
void *old_value = NULL;
if (entry->state == ENTRY_ACTIVE) {
@@ -107,21 +110,21 @@ void *hashmap_set(hashmap_t *ht, const void *key, void *value) {
return old_value;
}
void *hashmap_get(hashmap_t *ht, const void *key) {
void *scc_hashtable_get(scc_hashtable_t *ht, const void *key) {
if (ht->entries.cap == 0)
return NULL;
u32 hash = ht->hash_func(key);
hashmap_entry_t *entry = find_entry(ht, key, hash);
scc_hashtable_entry_t *entry = find_entry(ht, key, hash);
return (entry && entry->state == ENTRY_ACTIVE) ? entry->value : NULL;
}
void *hashmap_del(hashmap_t *ht, const void *key) {
void *scc_hashtable_del(scc_hashtable_t *ht, const void *key) {
if (ht->entries.cap == 0)
return NULL;
u32 hash = ht->hash_func(key);
hashmap_entry_t *entry = find_entry(ht, key, hash);
scc_hashtable_entry_t *entry = find_entry(ht, key, hash);
if (entry == NULL || entry->state != ENTRY_ACTIVE)
return NULL;
@@ -133,17 +136,18 @@ void *hashmap_del(hashmap_t *ht, const void *key) {
return value;
}
void hashmap_drop(hashmap_t *ht) {
void scc_hashtable_drop(scc_hashtable_t *ht) {
scc_vec_free(ht->entries);
ht->count = 0;
ht->tombstone_count = 0;
}
void hashmap_foreach(hashmap_t *ht, hashmap_iter_fn iter_func, void *context) {
void scc_hashtable_foreach(scc_hashtable_t *ht, scc_hashtable_iter_fn iter_func,
void *context) {
for (usize i = 0; i < ht->entries.cap; i++) {
hashmap_entry_t *entry = &scc_vec_at(ht->entries, i);
scc_hashtable_entry_t *entry = &scc_vec_at(ht->entries, i);
if (entry->state == ENTRY_ACTIVE) {
if (!iter_func(entry->key, entry->value, context)) {
if (iter_func(entry->key, entry->value, context)) {
break; // let the callback terminate the iteration
}
}
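
Reviewer note: `find_entry` relies on open addressing with three slot states, where a tombstone keeps probe chains intact after a deletion and the first tombstone seen is reused on insert. The sketch below shows that idea on a fixed-size table with unsigned integer keys. The `demo_*` names are hypothetical, the key doubles as its own hash, and linear probing is an assumption because the probe step is not visible in this hunk.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_CAP = 8 }; /* power of two, so key & (DEMO_CAP - 1) is the slot */

typedef enum { DEMO_EMPTY, DEMO_ACTIVE, DEMO_TOMBSTONE } demo_state_t;

typedef struct {
    unsigned key;
    int value;
    demo_state_t state;
} demo_entry_t;

typedef struct {
    demo_entry_t entries[DEMO_CAP];
} demo_table_t;

/* Return the slot holding key, or the slot where key should be stored:
 * the first tombstone on the probe path, else the first empty slot.
 * Returns NULL only when the table is full of other active keys. */
static demo_entry_t *demo_find(demo_table_t *t, unsigned key) {
    assert(t != NULL);
    unsigned index = key & (DEMO_CAP - 1);
    demo_entry_t *tombstone = NULL;
    for (unsigned probe = 0; probe < DEMO_CAP; ++probe) {
        demo_entry_t *e = &t->entries[index];
        if (e->state == DEMO_EMPTY) {
            return tombstone ? tombstone : e;
        }
        if (e->state == DEMO_TOMBSTONE) {
            if (tombstone == NULL) {
                tombstone = e; /* remember it, but keep probing */
            }
        } else if (e->key == key) {
            return e;
        }
        index = (index + 1) & (DEMO_CAP - 1); /* linear probing (assumed) */
    }
    return tombstone;
}

/* Insert or overwrite; returns 1 on success, 0 when the table is full. */
static int demo_set(demo_table_t *t, unsigned key, int value) {
    demo_entry_t *e = demo_find(t, key);
    if (e == NULL) {
        return 0;
    }
    e->key = key;
    e->value = value;
    e->state = DEMO_ACTIVE;
    return 1;
}

/* Look up a key; returns NULL when it is absent. */
static const int *demo_get(demo_table_t *t, unsigned key) {
    demo_entry_t *e = demo_find(t, key);
    return (e != NULL && e->state == DEMO_ACTIVE) ? &e->value : NULL;
}

/* Delete by marking the slot as a tombstone; returns 1 if a key was removed. */
static int demo_del(demo_table_t *t, unsigned key) {
    demo_entry_t *e = demo_find(t, key);
    if (e == NULL || e->state != DEMO_ACTIVE) {
        return 0;
    }
    e->state = DEMO_TOMBSTONE;
    return 1;
}
```

Keys 1 and 9 collide in this table; deleting 1 leaves a tombstone, so a later lookup of 9 still walks past that slot instead of stopping early.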

View File

@@ -0,0 +1,32 @@
#include <scc_strpool.h>
void scc_strpool_init(scc_strpool_t *pool) {
pool->ht.hash_func = (u32 (*)(const void *))scc_strhash32;
pool->ht.key_cmp = (int (*)(const void *, const void *))scc_strcmp;
scc_hashtable_init(&pool->ht);
}
const char *scc_strpool_intern(scc_strpool_t *pool, const char *str) {
void *existing = scc_hashtable_get(&pool->ht, str);
if (existing) {
return existing;
}
usize len = scc_strlen(str) + 1;
char *new_str = scc_malloc(len);
if (!new_str) {
LOG_ERROR("strpool: Failed to allocate memory for string");
return NULL;
}
scc_memcpy(new_str, str, len);
scc_hashtable_set(&pool->ht, new_str, new_str);
return new_str;
}
void scc_strpool_drop(scc_strpool_t *pool) { scc_hashtable_drop(&pool->ht); }
void scc_strpool_foreach(scc_strpool_t *pool, scc_strpool_iter_fn iter_func,
void *context) {
scc_hashtable_foreach(&pool->ht, (scc_hashtable_iter_fn)iter_func, context);
}
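
Reviewer note: `scc_strpool_intern` returns one canonical pointer per distinct string, so interned strings can afterwards be compared by pointer. The sketch below demonstrates that property with a hypothetical `demo_pool_t` that uses a small linear array instead of the hash table in the diff. It also frees the copies in its drop function, whereas `scc_strpool_drop` as shown here only calls `scc_hashtable_drop`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { DEMO_POOL_MAX = 64 }; /* hypothetical fixed capacity */

typedef struct {
    char *items[DEMO_POOL_MAX];
    size_t count;
} demo_pool_t;

/* Return the canonical copy of str, creating it on first use.
 * Lookup is a linear scan here; the diff uses a hash table instead. */
static const char *demo_intern(demo_pool_t *pool, const char *str) {
    assert(pool != NULL && str != NULL);
    for (size_t i = 0; i < pool->count; ++i) {
        if (strcmp(pool->items[i], str) == 0) {
            return pool->items[i];
        }
    }
    if (pool->count == DEMO_POOL_MAX) {
        return NULL; /* pool is full */
    }
    size_t len = strlen(str) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, str, len);
    pool->items[pool->count++] = copy;
    return copy;
}

/* Free every interned copy and reset the pool. */
static void demo_pool_drop(demo_pool_t *pool) {
    for (size_t i = 0; i < pool->count; ++i) {
        free(pool->items[i]);
    }
    pool->count = 0;
}
```

Two buffers with equal contents intern to the same pointer, and that pointer is the pool's copy rather than either input buffer.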

View File

@@ -608,9 +608,10 @@ class GccCompiler(Compiler):
flags = {
BuildMode.TEST: [
"-DTEST_MODE",
"-O2",
"-O0",
"-g",
"--coverage",
"-fprofile-update=atomic",
"-Wall",
"-Wextra",
],
@@ -647,9 +648,10 @@ class ClangCompiler(Compiler):
flags = {
BuildMode.TEST: [
"-DTEST_MODE",
"-O2",
"-O0",
"-g",
"--coverage",
"-fprofile-update=atomic",
"-Wall",
"-Wextra",
],
@@ -861,7 +863,7 @@ class PackageBuilder:
"""Print the dependency tree"""
self.context.resolver.print_tree()
def tests(self, filter_str: str = ""):
def tests(self, filter_str: str = "", timeout: int = 30):
"""Run tests"""
targets = [
t for t in self.context.get_targets() if t.type == TargetType.TEST_EXEC
@@ -888,7 +890,7 @@ class PackageBuilder:
for target in targets:
logger.info("Running test: %s", target.name)
try:
result = subprocess.run(target.output, check=True, timeout=30)
result = subprocess.run(target.output, check=True, timeout=timeout)
if result.returncode == 0:
print(f" ✓ Test {target.name} passed")
passed += 1
@@ -1064,13 +1066,16 @@ def create_parser():
# test command
test_parser = subparsers.add_parser("test", help="Run tests")
add_common_args(test_parser)
test_parser.add_argument("--timeout", "-t", type=int, default=3, help="Test timeout in seconds")
test_parser.add_argument("--filter", default="", help="Filter tests")
test_parser.set_defaults(mode=BuildMode.TEST)
# clean command
clean_parser = subparsers.add_parser("clean", help="Clean build artifacts")
add_common_args(clean_parser)
clean_parser.add_argument("--all", action="store_true", help="Clean all modes")
clean_parser.add_argument(
"-a", "--all", action="store_true", default=True, help="Clean all modes"
)
# tree command
tree_parser = subparsers.add_parser("tree", help="Show the dependency tree")
@@ -1131,7 +1136,7 @@ def main():
builder.run()
elif args.command == "test":
builder.build([TargetType.TEST_EXEC])
builder.tests(getattr(args, "filter", ""))
builder.tests(args.filter, args.timeout)
elif args.command == "clean":
if hasattr(args, "all") and args.all:
# clean all modes

53
tools/wc.py Normal file
View File

@@ -0,0 +1,53 @@
"""Count the lines of C/C++ files under a directory (written by AI)."""
import os


def count_lines(file_path):
    """Count the lines in a single file."""
    try:
        # Read in binary mode to avoid encoding problems.
        with open(file_path, "rb") as f:
            return sum(1 for _ in f)
    except UnicodeDecodeError:
        print(f"Warning: cannot decode file {file_path} (it may not be a text file)")
        return 0
    except Exception as e:
        print(f"Error reading {file_path}: {str(e)}")
        return 0


def scan_files(directory, exclude_dirs=None):
    """Scan a directory and collect all C/C++ files."""
    if exclude_dirs is None:
        # Directories excluded by default.
        exclude_dirs = [".git", "venv", "__pycache__", ".old"]
    c_files = []
    for root, dirs, files in os.walk(directory):
        # Skip excluded directories.
        dirs[:] = [d for d in dirs if d not in exclude_dirs]
        for file in files:
            if file.endswith((".c", ".h")):
                full_path = os.path.join(root, file)
                c_files.append(full_path)
    return c_files


def main():
    """main function"""
    target_dir = input("Directory to scan (leave empty for the current directory): ") or "."
    files = scan_files(target_dir)
    total_lines = 0
    print("\nResults:")
    for idx, file in enumerate(files, 1):
        lines = count_lines(file)
        total_lines += lines
        print(f"{idx:4d}. {file} ({lines} lines)")
    print(f"\nTotal: {len(files)} C/C++ files, {total_lines} lines of code")


if __name__ == "__main__":
    main()