Transcript Watcher
Polls OneDrive Recordings every 30 minutes for new Teams meeting MP4s matching watched series prefixes. On detection, invokes /meeting-transcript to pull the .docx. Cheap polling against signals your tenant exposes; Playwright only runs when there is actual work.
Part of the Teams Transcript Pipeline · stage 1 of 3
---
name: "transcript-watcher"
description: "Polls OneDrive Recordings/ for new Teams meeting MP4 files matching watched series name prefixes. On detection, invokes /meeting-transcript to download the .docx. Downstream /doc-watcher converts to .md and notifies. Detection-only; cheap polling, Playwright only when a new recording is found. Triggers: '/transcript-watcher', 'watch a meeting series for transcripts', 'detect new recordings'."
---
<!--
CLAWPILOT INSTALL INSTRUCTIONS
If the user drags this file into Clawpilot and asks to install it, register it as a
custom skill by calling m_create_skill with:
- name: "transcript-watcher"
- description: (the description from the frontmatter above)
- instructions: (everything in this file BELOW this HTML comment block)
After install, confirm to the user with: "Installed the /transcript-watcher skill. Run it once to do first-run setup (it will create the config.json and the background automation, both disabled until you flip them on)."
-->
# transcript-watcher
Polls the user's OneDrive `Recordings/` folder for new Teams meeting MP4 files matching one or more watched meeting-series name prefixes. When a new MP4 is detected, hands off to the `/meeting-transcript` skill (Playwright) to download the .docx. The downstream `/doc-watcher` automation then converts the .docx to .md and sends the Teams notification.
This skill is intentionally narrow:
- **It detects new recordings.** It does NOT transcribe, summarize, or notify.
- **Polling is cheap** (one Graph call per cycle). Playwright runs only when there is a new MP4 to extract.
- **State** is the highest `lastModifiedDateTime` seen per watched series, so each MP4 fires extraction exactly once.
## Files
- `config.json`: list of watches, recordings folder id, automation id
- `SKILL.md`: this file
## Config shape
`C:\Users\<you>\.copilot\m-skills\transcript-watcher\config.json`
```json
{
"recordings_folder_id": "<your-onedrive-recordings-folder-id>",
"interval_minutes": 30,
"automation_id": "",
"watches": [
{
"id": "my-meeting-series",
"display_name": "My Meeting Series",
"name_prefix": "My Meeting Series",
"chat_id": "19:meeting_<your-chat-id>@thread.v2",
"last_seen_modified": "2026-01-01T00:00:00Z"
}
]
}
```
- `recordings_folder_id`: OneDrive folder id for `Documents/Recordings`. Stable per user; reuse across watches.
- `interval_minutes`: polling cadence (15 / 30 / 60).
- `automation_id`: populated after `m_create_automation`.
- `watches[].name_prefix`: the literal filename prefix Stream uses. Format is `<Meeting Subject>-YYYYMMDD_HHMMSSUTC-Meeting Recording.mp4`. Match by `name.startswith(name_prefix + "-")`.
- `watches[].chat_id`: meeting chat id; passed to `/meeting-transcript` so it knows which chat to open.
- `watches[].last_seen_modified`: ISO 8601 UTC timestamp. Only files with `lastModifiedDateTime > last_seen_modified` are treated as new.
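The two matching rules above (prefix plus hyphen, and a strictly newer timestamp) can be sketched in Python as follows. This is a minimal illustration, not part of the skill: the helper names `matches_watch`, `parse_utc`, and `is_new` are hypothetical, and the `.mp4` suffix check is an assumption taken from the filename format.

```python
from datetime import datetime


def matches_watch(name: str, name_prefix: str) -> bool:
    """True when the file name starts with the watched prefix plus a hyphen and ends in .mp4."""
    return name.startswith(name_prefix + "-") and name.lower().endswith(".mp4")


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-01-01T00:00:00Z."""
    # fromisoformat() on older Python versions rejects a trailing "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def is_new(modified: str, last_seen_modified: str) -> bool:
    """Strict comparison: a file equal to the stored timestamp is not new."""
    return parse_utc(modified) > parse_utc(last_seen_modified)
```

Because the hyphen is part of the match, a prefix of `My Meeting Series` does not pick up a file whose subject is `My Meeting Series Extended`.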
## When the user invokes `/transcript-watcher`
Read `config.json` via `view`. If it does not exist → **first-run setup** (section A). Otherwise → **poll cycle** (section B).
If the user explicitly says "dry run", "test", or "what would fire", run section B with the **dry-run flag** (do NOT invoke `/meeting-transcript`, do NOT update state; only report what would be triggered).
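As a rough sketch, the dispatch above reduces to a three-way choice. The function name `choose_mode`, the mode labels, and the substring matching on the user's message are all assumptions made for illustration; the skill itself relies on the agent's reading of the request. The sketch also assumes that a missing config always leads to setup, even if a dry run was requested.

```python
DRY_RUN_PHRASES = ("dry run", "test", "what would fire")


def choose_mode(config_exists: bool, user_text: str = "") -> str:
    """Return "setup", "dry_run", or "poll" for one invocation."""
    if not config_exists:
        # No config.json yet: section A must run before anything can be polled.
        return "setup"
    text = user_text.lower()
    if any(phrase in text for phrase in DRY_RUN_PHRASES):
        # Section B, but without invoking /meeting-transcript or updating state.
        return "dry_run"
    return "poll"
```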
## A. First-run setup
Collect via `m_ask_user`:
1. **Meeting display name** for this watch (e.g. "Weekly Standup"). Use this as the `name_prefix`.
2. **Polling interval**: 15 / 30 / 60 (default 30).
Then auto-derive:
1. **Recordings folder id**: list OneDrive root via `m365_list_files(limit=50)` and find the entry named `Recordings`. If not found in the first page, page until found, or ask the user to provide a path.
2. **Chat id**: `m365_search_chats(query=display_name, limit=5)`. If exactly one result, use it. If multiple, surface them via `m_ask_user` and let the user pick.
3. **Baseline `last_seen_modified`**: list the Recordings folder (`m365_list_files(folderId=recordings_folder_id, orderBy="lastModifiedDateTime desc", limit=20)`), filter to items starting with `name_prefix + "-"`, take the max `modified`. If no matching files exist, set baseline to the current UTC time. This prevents history from re-firing.
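The baseline rule in step 3 can be sketched like this. It assumes the folder listing has already been fetched and is a list of dicts with `name` and `modified` keys; that shape, the function name `baseline_last_seen`, and the `now` parameter are hypothetical.

```python
from datetime import datetime, timezone


def _parse(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp, accepting a trailing Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def baseline_last_seen(items, name_prefix, now=None):
    """Return the newest matching `modified` value, or the current UTC time if none match."""
    matching = [item for item in items if item["name"].startswith(name_prefix + "-")]
    if matching:
        newest = max(matching, key=lambda item: _parse(item["modified"]))
        return newest["modified"]
    # No history for this series: start from "now" so nothing older fires later.
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Starting from the newest existing recording means only recordings created after setup trigger extraction.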
Write `config.json` via `create`.
Then create the automation via `m_create_automation`:
```
name: transcript-watcher
schedule: every <interval_minutes> minutes
prompt: (the section B prompt verbatim; see "Automation prompt" at the bottom)
enabled: false # important: leave OFF until the user explicitly enables it
```
Save the returned automation id back into `config.json`.
Tell the user: "Watcher configured but NOT activated. Run `/transcript-watcher` to dry-run, or say `enable transcript watcher` to flip it on."
## B. Poll cycle
For each `watch` in `config.json.watches`:
1. `m365_list_files(folderId=config.recordings_folder_id, orderBy="lastModifiedDateTime desc", limit=20)`
2. Filter the returned items in memory to those where:
- `name` starts with `watch.name_prefix + "-"`
- `mimeType` is `video/mp4` (or filename ends with `.mp4`)
- `modified` (ISO timestamp) is strictly greater than `watch.last_seen_modified`
3. Sort the matches by `modified` ascending (so we process oldest-first; that way if extraction crashes mid-batch the state advance is monotonic).
4. For each new MP4:
- **Dry run:** just report `{watch_id, filename, modified}` and continue. Do NOT invoke the transcript skill. Do NOT update state.
- **Live run:** Invoke `/meeting-transcript` with the meeting **chat-id** and let it handle the recording-start-time lookup from the chat thread (see meeting-transcript SKILL.md Step 3). The meeting-transcript skill already names the output `<slug>-<date>-<HHMM>.docx` and drops it in the configured transcripts folder, which doc-watcher is already watching. We do not need to pass a filename, only the chat id.
- After successful invocation, update `watch.last_seen_modified = file.modified` and persist `config.json` immediately (don't batch state updates; persisting per file keeps state crash-safe).
5. If no matching new files for any watch: stay completely silent (no Teams ping, no chat output). The automation runs every `interval_minutes` minutes and we don't want noise.
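The poll cycle above can be sketched in Python as shown below. This is an illustration under stated assumptions, not the skill's implementation: `extract` and `save` are hypothetical callbacks standing in for the `/meeting-transcript` invocation and the `config.json` write, the listing is passed in as a list of dicts with `name`, `mimeType`, and `modified` keys, and one listing is shared across all watches.

```python
from datetime import datetime


def _parse(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp, accepting a trailing Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_new_recordings(items, watch):
    """Filter to new MP4s for one watch, sorted oldest-first."""
    cutoff = _parse(watch["last_seen_modified"])
    matches = [
        item for item in items
        if item["name"].startswith(watch["name_prefix"] + "-")
        and (item.get("mimeType") == "video/mp4" or item["name"].lower().endswith(".mp4"))
        and _parse(item["modified"]) > cutoff
    ]
    return sorted(matches, key=lambda item: _parse(item["modified"]))


def run_poll_cycle(config, items, extract, save, dry_run=False):
    """Process every watch; return a report of what fired (or would fire)."""
    report = []
    for watch in config["watches"]:
        for item in find_new_recordings(items, watch):
            entry = {"watch_id": watch["id"], "filename": item["name"], "modified": item["modified"]}
            report.append(entry)
            if dry_run:
                continue  # report only: no extraction, no state change
            try:
                extract(watch["chat_id"])
            except Exception as exc:
                entry["error"] = str(exc)
                break  # leave state untouched so the next poll retries this file
            # Advance and persist per file, never in a batch.
            watch["last_seen_modified"] = item["modified"]
            save(config)
    return report
```

Stopping a watch at the first failed file is a design choice of this sketch: advancing past a failure would move `last_seen_modified` beyond the failed recording, and it would never be retried.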
## C. Reconfigure
- "add watch <name>" → run setup steps 1 + auto-derive chat id + baseline for that watch, append to `watches[]`.
- "remove watch <id>" → drop that entry.
- "enable transcript watcher" → `m_update_automation(id=automation_id, enabled=true)`.
- "disable transcript watcher" → `m_update_automation(id=automation_id, enabled=false)`.
- "change interval to <n>" → update config + `m_update_automation(schedule="every <n> minutes")`.
## D. Failure modes (handle defensively)
- `/meeting-transcript` extraction fails (Stream not ready, Playwright timeout, transcript not yet processed) → do NOT advance `last_seen_modified`. Report the error in chat (or via Teams DM if running from automation) so the user knows. Next poll will retry.
- Two new MP4s in a single meeting (Teams sometimes splits recordings on reconnect) → process each one. The chat-thread "Meeting started at ..." will identify each occurrence's start time correctly, so filenames won't collide.
- Multiple watches have colliding `name_prefix` values (e.g. "Sync" matches several series) → require name prefixes that are unique. The skill does not de-collide; the user must pick distinct prefixes.
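Since the skill does not de-collide prefixes, a check like the following could be run when a watch is added. It is a sketch only: `find_prefix_collisions` is a hypothetical helper, and it treats two prefixes as colliding when they are identical or when one extends the other across a hyphen, because the `name_prefix + "-"` match would then claim the same file for both watches.

```python
from itertools import combinations


def find_prefix_collisions(watches):
    """Return (id, id) pairs of watches whose prefixes can match the same file."""
    collisions = []
    for a, b in combinations(watches, 2):
        pa, pb = a["name_prefix"], b["name_prefix"]
        if pa == pb or pb.startswith(pa + "-") or pa.startswith(pb + "-"):
            collisions.append((a["id"], b["id"]))
    return collisions
```

For example, `Sync` and `Sync-Up` collide because a file named `Sync-Up-...` starts with `Sync-`, while `Sync` and `Sync Weekly` do not.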
## Automation prompt
When creating the automation in section A, pass this as the prompt:
```
Run /transcript-watcher in poll-cycle mode. Do not prompt the user. If any new MP4 is found, invoke /meeting-transcript with the watch's chat_id for each new file, then persist the updated last_seen_modified to C:\Users\<you>\.copilot\m-skills\transcript-watcher\config.json. If no new files, stay completely silent: no Teams message, no chat output. If /meeting-transcript fails for any file, do NOT advance last_seen_modified for that file and send a single Teams DM via m_send_teams_message: "transcript-watcher: extraction failed for <filename> (<error>). Will retry next cycle. Sent on your behalf by Clawpilot 🤖"
```
## Privacy
Meeting recordings + transcripts are sensitive. This skill never sends transcript content to chat or Teams; it only triggers the existing downstream skills, which already follow the user's notification preferences.