Coding
—
ROCO Bench
- LLM generate path points
- using RRT
—
Humanoid Bench
- env
- two agent mujoco model
- reward, observation space, action space
- train
- tdmpc2
- ppo
Paper Reading
CoT
—
basic CoT
- manually add prompt
- make LLM thinking step by step
—
zero-shot CoT
- add prompt:
"Let's think step by step"
- pros: simple, zero-shot
- cons: bad performance
—
AutoCoT
- use BERT, cluster
- kmeans
- example guided
—
ToT
- Thought Decomposition
- Thought Generator
- Sample
- Propose
- State Evaluator
- Value
- Vote
—
GoT
::: block
- find unreasonable choices
- analyze why unreasonable with context
- find the best choice
:::
—
Chain of Draft
::: block Think step by step, but only keep a minimum draft for each thinking step, with 5 words at most. Return the answer at the end of the response after a separator . :::
A-MEM
graph TB a[Environment] b[LLM Agent] c[Agentic] d[Memory] a-->|Interaction|b b-->|Interaction|a b-->|Write|c c-->|Read|b c-->d d-->c c-->d d-->c
—
Method
::: block
- Zettelkasten Method
- Link Generation
- Memory Evolution
- Retrieve Relative Memory
:::
Preserving and combining knowledge in robotic lifelong reinforcement learning
—
—
DPMM
—
Variational Inference
—
Metrics
- average success rate
- forgetting
- forward transfer
- Improvement of few-shot knowledge recall