MiroFish/backend/README.md
666ghj c60e6e1089 Refactor project creation process in API documentation and code
- Updated README.md to reflect the removal of the project creation endpoint, adjusting the workflow steps accordingly.
- Removed the `create_project` function from graph.py, streamlining the project management API by eliminating deprecated functionality.
2025-11-29 00:47:54 +08:00

239 lines
5.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# MiroFish Backend
社会舆论模拟系统后端服务基于Flask框架。
## 项目结构
```
backend/
├── app/
│ ├── __init__.py # Flask应用工厂
│ ├── config.py # 配置管理
│ ├── api/ # API路由
│ │ ├── __init__.py
│ │ └── graph.py # 图谱相关接口
│ ├── services/ # 业务逻辑层
│ │ ├── ontology_generator.py # 本体生成服务
│ │ ├── graph_builder.py # 图谱构建服务
│ │ └── text_processor.py # 文本处理服务
│ ├── models/ # 数据模型
│ │ ├── task.py # 任务状态管理
│ │ └── project.py # 项目上下文管理
│ └── utils/ # 工具模块
│ ├── file_parser.py # 文件解析
│ └── llm_client.py # LLM客户端
├── requirements.txt
└── run.py # 启动入口
```
## 安装
```bash
conda activate MiroFish
cd backend
pip install -r requirements.txt
```
## 配置
在项目根目录 `MiroFish/.env` 中配置:
```bash
# LLM配置
LLM_API_KEY=your-llm-api-key
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_MODEL_NAME=gpt-4o-mini
# Zep配置
ZEP_API_KEY=your-zep-api-key
```
## 启动服务
```bash
python run.py
```
服务默认运行在 http://localhost:5000
---
## API接口
### 核心工作流程
```
1. 上传文件 + 生成本体接口1
POST /api/graph/ontology/generate
→ 自动创建项目,返回 project_id
2. 构建图谱接口2
POST /api/graph/build
→ 传入 project_id
→ 返回 task_id
3. 查询任务进度
GET /api/graph/task/{task_id}
4. 获取图谱数据
GET /api/graph/data/{graph_id}
```
---
### 接口1生成本体定义
**POST** `/api/graph/ontology/generate`
上传文档,分析生成适合社会模拟的实体和关系类型定义。
**请求form-data**
| 字段 | 类型 | 必填 | 说明 |
|------|------|------|------|
| `files` | File | ✅ | PDF/MD/TXT文件可多个 |
| `simulation_requirement` | Text | ✅ | 模拟需求描述 |
| `project_name` | Text | ❌ | 项目名称 |
| `additional_context` | Text | ❌ | 额外说明 |
**响应示例:**
```json
{
"success": true,
"data": {
"project_id": "proj_abc123def456",
"project_name": "武汉大学舆情分析",
"ontology": {
"entity_types": [
{
"name": "Person",
"description": "Individuals who can express opinions",
"attributes": [...]
}
],
"edge_types": [
{
"name": "AFFILIATED_WITH",
"description": "Indicates affiliation",
"source_targets": [...]
}
]
},
"analysis_summary": "分析说明...",
"files": [
{"filename": "报告.pdf", "size": 123456}
],
"total_text_length": 20833
}
}
```
---
### 接口2构建图谱
**POST** `/api/graph/build`
根据 `project_id` 构建Zep知识图谱异步任务
**请求JSON**
```json
{
"project_id": "proj_abc123def456",
"graph_name": "图谱名称",
"chunk_size": 500,
"chunk_overlap": 50
}
```
| 字段 | 类型 | 必填 | 说明 |
|------|------|------|------|
| `project_id` | string | ✅ | 来自接口1的返回 |
| `graph_name` | string | ❌ | 图谱名称 |
| `chunk_size` | int | ❌ | 文本块大小默认500 |
| `chunk_overlap` | int | ❌ | 块重叠字符默认50 |
**响应:**
```json
{
"success": true,
"data": {
"project_id": "proj_abc123def456",
"task_id": "task_xyz789",
"message": "图谱构建任务已启动"
}
}
```
---
### 任务状态查询
**GET** `/api/graph/task/{task_id}`
```json
{
"success": true,
"data": {
"task_id": "task_xyz789",
"status": "processing",
"progress": 45,
"message": "添加文本块 (15/30)...",
"result": null
}
}
```
**状态值:**
- `pending` - 等待中
- `processing` - 处理中
- `completed` - 已完成
- `failed` - 失败
---
### 项目管理接口
| 方法 | 路径 | 说明 |
|------|------|------|
| GET | `/api/graph/project/{project_id}` | 获取项目详情 |
| GET | `/api/graph/project/list` | 列出所有项目 |
| DELETE | `/api/graph/project/{project_id}` | 删除项目 |
---
### 图谱数据接口
| 方法 | 路径 | 说明 |
|------|------|------|
| GET | `/api/graph/data/{graph_id}` | 获取图谱节点和边 |
| DELETE | `/api/graph/delete/{graph_id}` | 删除Zep图谱 |
---
## 实体设计原则
本系统专为社会舆论模拟设计,实体必须是:
**✅ 可以是:**
- 具体的个人(有名有姓)
- 注册的公司、组织、机构
- 媒体机构
- 政府部门
**❌ 不可以是:**
- 抽象概念(如"技术"、"创新"
- 情绪、观点、趋势
- 泛指的群体(如"用户"、"消费者"
这是因为后续需要模拟各实体对舆论的反应和传播,抽象概念无法参与这种模拟。
---
## 项目状态流转
```
created → ontology_generated → graph_building → graph_completed
failed
```