Academic Research Map: academic-research-mapper

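This skill maps the research landscape of any technical or academic topic. It searches academic databases such as arXiv and Semantic Scholar to systematically collect and analyze the relevant literature, identifying research trends, key papers, leading researchers, and institutional collaborations. It builds a knowledge map of the topic that shows how research directions branch and evolve. It is intended for graduate students, researchers, and newcomers who need a quick overview of a field. Automating literature search and analysis shortens the literature-review cycle and gives researchers a broad literature base when writing papers, proposing projects, or choosing a research direction.

The skill comes with an operating guide and best practices to help users get started and then go deeper. Its content is divided into functional modules with application scenarios, so it can be applied directly in real projects. It emphasizes practicality, covers material from basic configuration to advanced features for users at different levels, and is updated over time to track current developments and practice. Using it reduces trial and error, so effort can go to core work. Its modular design makes it easy to extend and customize, and it brings together common patterns, a learning path, and reference material so users can build a working knowledge framework quickly.

# Research Landscape Mapper: Understand a Field Before You Build or Write

You have access to the TinyFish CLI (`tinyfish`), a tool that runs browser automations from the terminal using natural language goals. This skill uses it to search arXiv, Semantic Scholar, and Google Scholar in parallel, then synthesizes the results into a structured landscape report with identified gaps.

## Pre-flight Check (REQUIRED)

Before making any TinyFish call, always run BOTH checks:

**1. CLI installed?**

bash/zsh:

```bash
which tinyfish && tinyfish --version || echo "TINYFISH_CLI_NOT_INSTALLED"
```

PowerShell:

```powershell
Get-Command tinyfish; tinyfish --version
```

If not installed, stop and tell the user:

> Install the TinyFish CLI: `npm install -g @tiny-fish/cli`

**2. Authenticated?**

```bash
tinyfish auth status
```

If not authenticated, stop and tell the user:

> You need a TinyFish API key.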
> Get one at: https://agent.tinyfish.ai/api-keys

Then authenticate:

**Option 1: CLI login (interactive)**

```bash
tinyfish auth login
```

**Option 2: bash/zsh (Mac/Linux, current session)**

```bash
export TINYFISH_API_KEY="your-api-key-here"
```

**Option 3: bash/zsh (persist across sessions, add to ~/.bashrc or ~/.zshrc)**

```bash
echo 'export TINYFISH_API_KEY="your-api-key-here"' >> ~/.zshrc
source ~/.zshrc
```

**Option 4: PowerShell (current session only)**

```powershell
$env:TINYFISH_API_KEY="your-api-key-here"
```

**Option 5: Claude Code settings**

Add to `~/.claude/settings.local.json`:

```json
{
  "env": {
    "TINYFISH_API_KEY": "your-api-key-here"
  }
}
```

Do NOT proceed until both checks pass.

## What This Skill Does

Given a research topic (e.g. "retrieval-augmented generation" or "protein structure prediction"), this skill:

1. Searches **arXiv** for preprints sorted by most recent, capturing what is being worked on right now
2. Searches **Semantic Scholar** for papers ranked by relevance with citation counts, identifying what the field considers important
3. Searches **Google Scholar** for broad coverage, including published venues not yet on arXiv

It then deduplicates across all three sources by title similarity, clusters papers into subtopics, and synthesizes findings into a structured landscape report: what is well-studied, what is emerging, and where the gaps are.

## Core Command

```bash
tinyfish agent run --url <url> "<goal>"
```

### Flags

| Flag | Purpose |
|------|---------|
| `--url <url>` | Target website URL for the agent to navigate |
| `--sync` | Wait for the full result before returning (required when you need output before the next step) |
| `--async` | Submit and return a run ID immediately; use when firing parallel agents |
| `--pretty` | Human-readable formatted output for debugging |

## Keyword Strategy

The quality of results depends entirely on your search terms. Before running anything, derive 2–3 keyword variants from the topic.
Each source has different vocabulary norms: academic terms work best on Semantic Scholar, shorter compressed terms work best on arXiv.

| Topic | Primary keywords | Variant A | Variant B |
|-------|------------------|-----------|-----------|
| Retrieval-augmented generation | retrieval augmented generation | RAG language model | dense retrieval QA |
| Protein structure prediction | protein structure prediction | AlphaFold protein folding | ab initio structure biology |
| Neural architecture search | neural architecture search | NAS automated machine learning | hyperparameter optimization deep learning |
| Federated learning privacy | federated learning | federated learning differential privacy | distributed training privacy |

Use the primary keywords for the first parallel pass. If any source returns fewer than 5 results, run a second pass with the variant keywords on that source only.

## Step-by-Step Workflow

### Step 1: Derive keywords and build URLs

Before running any agents, construct all three search URLs. Do this in your head or in a scratch note; do not make TinyFish calls yet.

arXiv URL pattern:

`https://arxiv.org/search/?query=<keywords>&searchtype=all&order=-announced_date_first`

Semantic Scholar URL pattern:

`https://www.semanticscholar.org/search?q=<keywords>&sort=Relevance`

Google Scholar URL pattern:

`https://scholar.google.com/scholar?q=<keywords>&as_sdt=0%2C5&hl=en`

Replace `<keywords>` with URL-encoded primary keywords (spaces become `+`).

### Step 2: Search all three sources in parallel

Fire all three agents simultaneously. Do NOT wait for one to finish before starting the next.

arXiv, sorted by most recent:

```bash
tinyfish agent run --sync \
  --url "https://arxiv.org/search/?query=retrieval+augmented+generation&searchtype=all&order=-announced_date_first" \
  "Extract the top 15 search results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"abstract_snippet\": str (first 150 chars of abstract), \"arxiv_id\": str, \"url\": str}]. If a result has no year visible, use the submission date year."
```

Semantic Scholar, sorted by relevance with citation counts:

```bash
tinyfish agent run --sync \
  --url "https://www.semanticscholar.org/search?q=retrieval+augmented+generation&sort=Relevance" \
  "Extract the top 15 search results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"abstract_snippet\": str (first 150 chars), \"url\": str}]. Scroll down to load more results if fewer than 10 are visible."
```

Google Scholar, broad coverage:

```bash
tinyfish agent run --sync \
  --url "https://scholar.google.com/scholar?q=retrieval+augmented+generation&as_sdt=0%2C5&hl=en" \
  "Extract the top 15 search results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"snippet\": str, \"url\": str}]. Citation count appears after 'Cited by'; extract that number."
```

## Parallel Execution

All three source searches are fully independent. Always fire them simultaneously.

**Good: parallel calls (fire and wait)**

```bash
tinyfish agent run --sync \
  --url "https://arxiv.org/search/?query=retrieval+augmented+generation&searchtype=all&order=-announced_date_first" \
  "Extract the top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"abstract_snippet\": str, \"arxiv_id\": str, \"url\": str}]" \
  > /tmp/arxiv_results.json &

tinyfish agent run --sync \
  --url "https://www.semanticscholar.org/search?q=retrieval+augmented+generation&sort=Relevance" \
  "Extract the top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"abstract_snippet\": str, \"url\": str}]" \
  > /tmp/s2_results.json &

tinyfish agent run --sync \
  --url "https://scholar.google.com/scholar?q=retrieval+augmented+generation&as_sdt=0%2C5&hl=en" \
  "Extract the top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"snippet\": str, \"url\": str}]" \
  > /tmp/scholar_results.json &

wait
echo "All three sources complete."
```

**Bad: sequential calls**

```bash
# Do NOT do this: triples the wait time for no benefit
tinyfish agent run --url "https://arxiv.org/..." "search arxiv, then also search semantic scholar, then also search google scholar"
```

Each source is always its own separate call. Never combine them into one goal.

### Step 3: Handle sparse results (if needed)

After the parallel run completes, check each result set. If any source returned fewer than 5 papers, run a second pass on that source with variant keywords:

```bash
# Example: arXiv returned only 3 results for primary keywords
tinyfish agent run --sync \
  --url "https://arxiv.org/search/?query=RAG+language+model&searchtype=all&order=-announced_date_first" \
  "Extract the top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"abstract_snippet\": str, \"arxiv_id\": str, \"url\": str}]"
```

Do not run second passes if the primary pass was already rich; this wastes steps.

### Step 4: Synthesize into a Landscape Report

Once all three sources have returned results, synthesize findings into this structure. Use only data that TinyFish actually returned; do not hallucinate paper titles, citation counts, or author names.

```markdown
## Research Landscape: <topic>

### Volume & Coverage
- arXiv: N papers found, most recent: <year>
- Semantic Scholar: N papers found, highest citations: N (<paper title>)
- Google Scholar: N papers found
- Unique papers after deduplication: N

### Key Papers (sorted by citation count)
1. **<Title>** - <Authors>, <Year>, <Venue if known> - <citation_count> citations
   <one-sentence summary from abstract snippet>
2. ...
(list top 8–10 unique papers)

### Active Subtopics
Cluster the papers by what they are actually about. Label each cluster with a short name.
- **Subtopic A**: N papers - <1-sentence description of what this cluster covers>
- **Subtopic B**: N papers - ...
- **Subtopic C**: N papers - ...

### Key Authors & Groups
- <Author name> - N papers in results, affiliated with <institution if visible>
- ...
(list authors appearing 2+ times across the results)

### Recency Signal
- Papers from last 12 months: N
- Papers from last 3 years: N
- Oldest paper in results: <year>
- Trend: accelerating / stable / declining (infer from year distribution)

### Gaps & Open Directions
Based on what the papers cover and what they do not:
- **Gap 1**: <specific thing that is missing or underexplored>
- **Gap 2**: ...
- **Gap 3**: ...

### Landscape Verdict
<2–3 sentences: is this field crowded or open, mature or nascent, dominated by a few groups or distributed, and what is the single most underexplored angle?>
```

## Deduplication Rules

Papers appear across multiple sources. Before synthesizing, deduplicate using these rules in order:

1. **Exact title match** (case-insensitive) → keep one, prefer the Semantic Scholar entry (has citation count)
2. **Title similarity > 85%** (same words, different punctuation) → treat as the same paper
3. **Same arXiv ID** → always the same paper regardless of title variation
4. If unsure, keep both and note the possible duplicate in the report

## Subtopic Clustering Guide

Group papers by reading their abstract snippets, not just their titles. Common cluster patterns:

| If papers discuss… | Cluster label |
|--------------------|---------------|
| Benchmarks, evaluation datasets, metrics | "Evaluation & benchmarks" |
| New model architectures or training methods | "Model architecture" |
| Application to a specific domain (medical, legal, code) | "Domain adaptation: `<domain>`" |
| Efficiency, speed, compression, cost | "Efficiency & scaling" |
| Safety, alignment, robustness, hallucination | "Safety & reliability" |
| Surveys, meta-analyses, overviews | "Surveys & overviews" |

A paper can belong to at most two clusters. Name the clusters based on what you actually see, not these defaults, if the topic warrants different ones.

## Managing Runs

```bash
# List recent runs (useful if a run takes longer than expected)
tinyfish agent run list

# Get the full output of a specific run by ID
tinyfish agent run get <run_id>

# Cancel a run that is taking too long
tinyfish agent run cancel <run_id>
```

## Output Format

The CLI streams `data: {...}` SSE lines by default.
The final usable result is the event where `type == "COMPLETE"` and `status == "COMPLETED"`; the extracted data is in the `resultJson` field. Read the raw output directly; no script-side parsing is required.

When saving to files with `>` redirection as shown in the parallel example, the full SSE stream is saved. Extract the JSON by looking for the last line containing `COMPLETED` and parsing the `resultJson` value from it.

## Example: Full Run for "Mixture of Experts"

```bash
# Step 1: fire all three in parallel
tinyfish agent run --sync \
  --url "https://arxiv.org/search/?query=mixture+of+experts+transformer&searchtype=all&order=-announced_date_first" \
  "Extract top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"abstract_snippet\": str, \"arxiv_id\": str, \"url\": str}]" \
  > /tmp/moe_arxiv.json &

tinyfish agent run --sync \
  --url "https://www.semanticscholar.org/search?q=mixture+of+experts+transformer&sort=Relevance" \
  "Extract top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"abstract_snippet\": str, \"url\": str}]" \
  > /tmp/moe_s2.json &

tinyfish agent run --sync \
  --url "https://scholar.google.com/scholar?q=mixture+of+experts+LLM&as_sdt=0%2C5&hl=en" \
  "Extract top 15 results as JSON: [{\"title\": str, \"authors\": [str], \"year\": str, \"citation_count\": str, \"venue\": str, \"snippet\": str, \"url\": str}]" \
  > /tmp/moe_scholar.json &

wait

# Step 2: synthesize
# Read /tmp/moe_arxiv.json, /tmp/moe_s2.json, /tmp/moe_scholar.json
# Deduplicate → cluster → produce landscape report
```
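For cases where the saved stream files are processed by a script instead of being read directly, the extraction rule from the Output Format section (last `COMPLETE` / `COMPLETED` event, `resultJson` field) could be sketched as follows. This is a minimal sketch, not part of the TinyFish CLI: the function name `extract_result` is hypothetical, the non-final event in the sample is made up for illustration, and the handling of a string-encoded `resultJson` is an assumption about how the payload might arrive.

```python
import json


def extract_result(sse_text):
    """Return the resultJson payload of the last COMPLETE/COMPLETED event, or None."""
    result = None
    for line in sse_text.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and any non-data lines
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip lines that are not complete JSON objects
        if not isinstance(event, dict):
            continue
        if event.get("type") == "COMPLETE" and event.get("status") == "COMPLETED":
            payload = event.get("resultJson")
            # Assumption: the payload may be a JSON-encoded string; decode it if so.
            if isinstance(payload, str):
                try:
                    payload = json.loads(payload)
                except json.JSONDecodeError:
                    pass
            result = payload  # keep overwriting so the last matching event wins
    return result


# Illustrative stream; only the final event shape is described in this document.
sample_stream = (
    'data: {"type": "PROGRESS"}\n'
    "\n"
    'data: {"type": "COMPLETE", "status": "COMPLETED", '
    '"resultJson": [{"title": "Placeholder paper", "year": "2024"}]}\n'
)
papers = extract_result(sample_stream)
```

With a saved file, the same function would be called as `extract_result(open("/tmp/moe_arxiv.json").read())`. A return value of `None` means no completed event was found, which is worth treating the same way as a sparse result in Step 3.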

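The deduplication rules listed earlier can be approximated in a few lines. This is a rough sketch under stated assumptions: the document does not say how title similarity is measured, so `difflib.SequenceMatcher` stands in for it here; the `source` field is a hypothetical tag added when merging the three result sets; and the sample records are invented placeholders, not real papers.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase a title and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def same_paper(a, b, threshold=0.85):
    """Apply the arXiv ID rule first, then exact and fuzzy title matching."""
    if a.get("arxiv_id") and a.get("arxiv_id") == b.get("arxiv_id"):
        return True  # same arXiv ID: same paper regardless of title variation
    ta, tb = normalize(a["title"]), normalize(b["title"])
    if ta == tb:
        return True  # exact match, ignoring case and punctuation
    # Assumption: SequenceMatcher's ratio is used as the similarity measure.
    return SequenceMatcher(None, ta, tb).ratio() > threshold


def deduplicate(papers):
    """Keep one entry per paper, preferring the Semantic Scholar record."""
    unique = []
    for paper in papers:
        for i, kept in enumerate(unique):
            if same_paper(paper, kept):
                is_s2 = paper.get("source") == "semantic_scholar"
                if is_s2 and kept.get("source") != "semantic_scholar":
                    unique[i] = paper  # Semantic Scholar entry carries the citation count
                break
        else:
            unique.append(paper)  # no match found among the kept entries
    return unique


# Invented placeholder records for illustration only.
records = [
    {"title": "Sparse Expert Routing: A Study", "source": "arxiv", "arxiv_id": "0000.00001"},
    {"title": "sparse expert routing - a study", "source": "semantic_scholar", "citation_count": "12"},
    {"title": "Dense Retrieval Baselines", "source": "google_scholar"},
]
unique_papers = deduplicate(records)
```

The fourth rule (keep both when unsure and flag the possible duplicate) is not automated here; pairs that score just below the threshold are the natural candidates to review by hand and note in the report.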