---
name: "meeting-transcript"
description: "Download the transcript of a recorded Teams meeting from its meeting chat. Resolves the chat to its online meeting and saves the transcript (.docx by default, .vtt optional) to a local folder. Triggers: 'get the transcript', 'download the meeting transcript', 'transcript for this meeting', or pasting a Teams meeting chat link and asking for the transcript."
---

<!--
CLAWPILOT INSTALL INSTRUCTIONS

If the user drags this file into Clawpilot and asks to install it, register it as a
custom skill by calling m_create_skill with:
  - name: "meeting-transcript"
  - description: (the description from the frontmatter above)
  - instructions: (everything in this file BELOW this HTML comment block)

Before first use, the user MUST replace the two {{...}} placeholders in the
Configuration section below with their own values. After install, confirm with:
"Installed the /meeting-transcript skill — open SKILL.md and fill in DEFAULT_OUTPUT_DIR
and TENANT_ID before first invoking it."
-->

# meeting-transcript

Download the full transcript of a recorded Teams meeting as a `.docx` (default) or `.vtt` by automating the Stream Transcript pane's native Download button.

> **Before first use:** open this file and replace the two `{{...}}` placeholders in the **Configuration** section below with your own values. Everything else works out of the box.

## Configuration (fill these in before first use)

- **`{{DEFAULT_OUTPUT_DIR}}`** — Absolute path to the folder where transcripts should be saved by default. Example: `C:\Users\<you>\Documents\MeetingTranscripts\`. The skill will create it if it doesn't exist.
- **`{{TENANT_ID}}`** — Your Microsoft 365 tenant GUID (the value Teams uses in `?tenantId=...` URLs). Find it at <https://portal.azure.com> → Microsoft Entra ID → Overview → Tenant ID, or by opening any Teams deep link in your browser and copying the `tenantId` query parameter.

## When to use
- User asks for the transcript or notes of a recorded Teams meeting and has supplied a link, subject, or chat ID for it.
- User wants the raw text of what was said in a past Teams call.
- Invoked programmatically by `/transcript-watcher` when a new MP4 lands in OneDrive Recordings.

## Inputs
- A Teams meeting link (`/meet/...`, `/l/meetup-join/...`, or `/l/chat/...`), OR
- A meeting subject (e.g. "Weekly Standup"), OR
- A chat ID (`19:meeting_<...>@thread.v2`).
- Optional format override: `vtt` (default is `docx`).
- Optional output directory override (default: `{{DEFAULT_OUTPUT_DIR}}`).

## Defaults (and how to override)

- **Default output directory:** `{{DEFAULT_OUTPUT_DIR}}`
  - If the user specifies a different directory at invocation time (e.g. "save it to my Desktop", "put it in Projects/Foo"), use that and create it if missing (`New-Item -ItemType Directory -Force`).
- **Default format:** `.docx` — always use "Download as .docx" unless the user explicitly says they want `.vtt` (e.g. "give me the vtt", "I need the captions file", "with timestamps as vtt"). For `.vtt`, click `Download as .vtt` instead.
- **Filename:** `<meeting-slug>-<YYYY-MM-DD>-<HHMM>.<ext>` where **both `<YYYY-MM-DD>` and `<HHMM>` come from the *recording start time*** — i.e. the "Meeting started at ..." line in the chat thread, in the user's local timezone. 24-hour, no colon. Example: a recording that started today at 8:31 AM local becomes `weekly-standup-2026-05-23-0831.docx`.

## CRITICAL: Timestamp source

**ALWAYS use the recording start time from the chat thread's "Meeting started at H:MM AM/PM" line. NEVER use the Recap picker's combobox label.**

The Recap picker (the "Select the meeting by time" dropdown) shows the **scheduled** occurrence the transcript was associated with — e.g. it may say "Friday, May 22, 2026 6:15 PM - 6:30 PM" even when the actual recording was started early at 8:31 AM today. The scheduled time is **wrong** for filename purposes. The user wants the time the recording actually started, which is only visible in the chat thread.

If you accidentally proceeded past Step 3 without capturing the recording start time, **go back to the chat thread and capture it before saving the file**.

## The verified flow

The Transcript pane uses **virtualization** — DOM scraping is unreliable for anything longer than ~3 minutes. Always automate the native Download button.

### Step 1: Find the chat
- If user gave a subject: `m365_search_chats query="<subject>"` → grab the `id` field (`19:meeting_<...>@thread.v2`).
- If user gave a `/l/chat/<id>/...` URL, extract `<id>` from the URL.
- URL-encode the chat ID (`:` → `%3A`, `@` → `%40`, or `encodeURIComponent`).

### Step 2: Open the chat (signed-in Teams web surface)
- Navigate to: `https://teams.microsoft.com/l/chat/<encoded-id>/0?tenantId={{TENANT_ID}}`
- If "Choose how to open" appears, click "Use the web app instead".
- If that page goes blank, navigate directly to `https://teams.cloud.microsoft/v2/#/l/chat/<encoded-id>/0?tenantId={{TENANT_ID}}`.
- Wait ~8 s for the chat thread to load.
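
Steps 1 and 2 can be sketched as two small pure helpers. This is a minimal sketch rather than part of the verified flow: `extractChatId` and `buildChatUrl` are hypothetical names, and the `/l/chat/<id>/` path shape is assumed from the link formats listed under Inputs.

```js
// Hypothetical helpers (illustrative names, not part of the skill itself).

// Pull the chat ID out of a /l/chat/<id>/... link; returns null if the shape differs.
function extractChatId(link) {
  const m = /\/l\/chat\/([^/?#]+)/.exec(link);
  // Links may carry the ID already percent-encoded, so decode it once here.
  return m ? decodeURIComponent(m[1]) : null;
}

// Build the Step 2 deep link. encodeURIComponent turns ':' into %3A and '@' into %40.
function buildChatUrl(chatId, tenantId) {
  return `https://teams.microsoft.com/l/chat/${encodeURIComponent(chatId)}/0` +
    `?tenantId=${encodeURIComponent(tenantId)}`;
}
```

Decoding before re-encoding keeps an already-encoded ID from being encoded twice.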

### Step 3 (CRITICAL): Capture recording date AND start time from the chat thread

**Do this BEFORE clicking the Transcript button.** Once you navigate away to the Recap surface you lose easy access to the chat thread's "Meeting started" lines.

Scroll to the bottom of the chat thread, then extract the visually-lowest "Meeting started at ..." entry. Teams formats it three ways depending on recency:
- `Meeting started at 5/12 5:54 PM` — explicit M/D (older than ~1 week)
- `Meeting started at Monday 12:14 PM` — day name (within the last week)
- `Meeting started at 8:31 AM` — bare time (today)

You must resolve all three into a concrete `{date: YYYY-MM-DD, hhmm: HHMM}` in the user's local timezone. Use the system clock for "today" and the most-recent-past occurrence for day names.

```js
async (page) => {
  await page.evaluate(() => {
    const btns = Array.from(document.querySelectorAll('button'))
      .filter(b => /^Transcript$/i.test((b.textContent||'').trim()));
    if (btns.length) btns[btns.length-1].scrollIntoView({block:'end'});
  });
  await page.waitForTimeout(1500);

  return await page.evaluate(() => {
    const all = Array.from(document.querySelectorAll('div, span, li'))
      .filter(el => {
        const t = (el.textContent||'').trim();
        return /Meeting started at /i.test(t) && t.length < 200;
      })
      .map(el => ({y: el.getBoundingClientRect().top, text: (el.textContent||'').trim()}));
    all.sort((a,b)=> b.y - a.y);
    for (const c of all) {
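      // Keep only what follows the last "Meeting started at"; [\s\S] also spans any newlines in textContent.
      const after = c.text.replace(/^[\s\S]*Meeting started at\s+/i, '').replace(/Meeting started.*$/i, '').trim();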
      const mDate = after.match(/^(\d{1,2})\/(\d{1,2})\s+(\d{1,2}):(\d{2})\s*(AM|PM)/i);
      const mDay  = after.match(/^(Sun|Mon|Tue|Wed|Thu|Fri|Sat)[a-z]*\s+(\d{1,2}):(\d{2})\s*(AM|PM)/i);
      const mTime = after.match(/^(\d{1,2}):(\d{2})\s*(AM|PM)/i);
      if (mDate || mDay || mTime) {
        return {
          raw: after.substring(0, 60),
          mDate: mDate && {month: +mDate[1], day: +mDate[2], hr: +mDate[3], min: mDate[4], ampm: mDate[5].toUpperCase()},
          mDay:  mDay  && {dayName: mDay[1], hr: +mDay[2], min: mDay[3], ampm: mDay[4].toUpperCase()},
          mTime: !mDate && !mDay && mTime && {hr: +mTime[1], min: mTime[2], ampm: mTime[3].toUpperCase()}
        };
      }
    }
    return null;
  });
}
```

Then resolve to `{date, hhmm}` in PowerShell or JS using the system clock:
- **Case A (mDate):** assume the current year; if the resulting date is later than today, subtract 1 year (the line refers to a meeting from the previous year).
- **Case B (mDay):** walk back from today to find the most recent matching weekday (could be today).
- **Case C (mTime):** date = today.

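Convert hr + AM/PM to 24-hour (12 AM becomes 00, 12 PM stays 12, any other PM hour adds 12), then zero-pad the hour and append the minutes: `HHMM = sprintf("%02d%s", h24, min)`.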

**Verified mappings (PT, May 2026):**
- `Meeting started at 8:31 AM` on Sat 5/23 → `date=2026-05-23, hhmm=0831`
- `Meeting started at Thursday 10:25 AM` viewed on Sat 5/23 → `date=2026-05-21, hhmm=1025`
- `Meeting started at 5/12 5:54 PM` viewed in 2026 → `date=2026-05-12, hhmm=1754`
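
The three cases above can be sketched in JS as follows. This is a minimal sketch, assuming the input has the shape returned by the Step 3 snippet (`mDate`, `mDay`, `mTime`); `resolveStart` is a hypothetical name, and the optional `now` argument is only there so the logic can be checked against a fixed clock.

```js
// Hypothetical helper: turns the Step 3 result into {date, hhmm} in local time.
function resolveStart(parsed, now = new Date()) {
  const m = parsed.mDate || parsed.mDay || parsed.mTime;
  if (!m) throw new Error('No "Meeting started at" match to resolve');

  // 12-hour to 24-hour: 12 AM -> 0, 12 PM -> 12, other PM hours add 12.
  const h24 = (m.hr % 12) + (m.ampm === 'PM' ? 12 : 0);

  const today = new Date(now.getFullYear(), now.getMonth(), now.getDate());
  let d = today; // Case C: a bare time means today

  if (parsed.mDate) {
    // Case A: assume the current year; fall back one year if that date is in the future.
    const { month, day } = parsed.mDate;
    d = new Date(today.getFullYear(), month - 1, day);
    if (d > today) d = new Date(today.getFullYear() - 1, month - 1, day);
  } else if (parsed.mDay) {
    // Case B: walk back 0-6 days to the most recent matching weekday.
    const names = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat'];
    const target = names.indexOf(parsed.mDay.dayName.slice(0, 3).toLowerCase());
    const back = (today.getDay() - target + 7) % 7;
    d = new Date(today.getFullYear(), today.getMonth(), today.getDate() - back);
  }

  const pad = n => String(n).padStart(2, '0');
  return {
    date: `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}`,
    hhmm: `${pad(h24)}${pad(m.min)}`,
  };
}
```

With the clock fixed at Sat 2026-05-23, `{mDay: {dayName: 'Thu', hr: 10, min: '25', ampm: 'AM'}}` resolves to `{date: '2026-05-21', hhmm: '1025'}`, matching the second verified mapping.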

### Step 4: Open the Transcript pane

Click the **last** `Transcript` button in the chat thread (one button per recorded occurrence; last = most recent):

```js
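// Collect every button whose visible label is exactly "Transcript".
const buttons = Array.from(document.querySelectorAll('button'))
  .filter(b => /^Transcript$/i.test((b.textContent||'').trim()));
if (!buttons.length) throw new Error('No Transcript button found in the chat thread');
// The last button belongs to the most recent recorded occurrence.
buttons[buttons.length-1].click();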
```

This routes to the Stream playback surface with the Transcript pane open in a right-hand iframe at `microsoft-my.sharepoint.com/.../xplatplugins.aspx`. Wait ~6 s for the iframe to load.
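
Instead of a single fixed wait, the iframe can be polled for until it appears. The helper below is a sketch under assumptions: `waitForFrame` is a hypothetical name, and the 15 s timeout and 500 ms interval are illustrative defaults, not verified values. It relies only on `page.frames()`, `frame.url()`, and `page.waitForTimeout()`, which the snippets in this file already use.

```js
// Hypothetical helper: poll page.frames() until one frame URL contains `urlPart`.
async function waitForFrame(page, urlPart, timeoutMs = 15000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const frame = page.frames().find(f => f.url().includes(urlPart));
    if (frame) return frame;
    if (Date.now() >= deadline) {
      throw new Error(`No frame matching "${urlPart}" after ${timeoutMs} ms`);
    }
    await page.waitForTimeout(intervalMs);
  }
}
```

Usage would look like `const tFrame = await waitForFrame(page, 'xplatplugins.aspx');`, which returns as soon as the frame is attached and fails with a clear message if it never loads.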

**DO NOT trust the Recap picker's combobox label** ("Select the meeting by time") — it shows the *scheduled* occurrence the recording is filed under, which can differ from the actual recording start. If the picker label and the chat-thread "Meeting started" line disagree, the chat thread wins.

### Step 5: Click Download → "Download as .docx" (or .vtt if requested)

```js
async (page, format = 'docx') => {
  const tFrame = page.frames().find(f => f.url().includes('xplatplugins.aspx'));
  if (!tFrame) throw new Error('Transcript iframe not found - wait longer for load');

  await tFrame.getByRole('button', {name: 'Download'}).click();
  await page.waitForTimeout(500);

  const menuLabel = format === 'vtt' ? /Download as \.vtt/ : /Download as \.docx/;

  const [download] = await Promise.all([
    page.waitForEvent('download', {timeout: 30000}),
    tFrame.getByRole('menuitem', {name: menuLabel}).click()
  ]);

  return download;
}
```

### Step 6: Save with `<slug>-<YYYY-MM-DD>-<HHMM>.<ext>`

```powershell
$dir = "<override or {{DEFAULT_OUTPUT_DIR}}>"
New-Item -ItemType Directory -Force -Path $dir | Out-Null
$slug = "<kebab-case of meeting subject>"
$date = "<YYYY-MM-DD from Step 3 — recording date>"
$hhmm = "<HHMM from Step 3 — recording start time, NOT scheduled time>"
$ext  = "docx"
$path = Join-Path $dir "$slug-$date-$hhmm.$ext"
# In Playwright: await download.saveAs(path)
```

Example final names (all use *recording* start time, never scheduled time):
- `weekly-standup-2026-05-23-0831.docx` — recording started today at 8:31 AM (regardless of when the meeting was scheduled)
- `weekly-standup-2026-05-21-1025.docx` — recording started Thu 5/21 at 10:25 AM
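
If the filename is assembled on the Playwright side rather than in PowerShell, a sketch could look like the following. `toSlug` and `transcriptFileName` are hypothetical names, and the slug rules (strip accents, lowercase, collapse anything non-alphanumeric into a single hyphen) are an assumption about what "kebab-case of meeting subject" means here.

```js
// Hypothetical helper: kebab-case a meeting subject for use in a filename.
function toSlug(subject) {
  return subject
    .normalize('NFKD')                // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // collapse every other run of characters into '-'
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

// Assemble <slug>-<YYYY-MM-DD>-<HHMM>.<ext>; falls back to "meeting" if the slug is empty.
function transcriptFileName(subject, date, hhmm, ext = 'docx') {
  return `${toSlug(subject) || 'meeting'}-${date}-${hhmm}.${ext}`;
}
```

For example, `transcriptFileName('Weekly Standup', '2026-05-23', '0831')` yields `weekly-standup-2026-05-23-0831.docx`.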

### Step 7 (REQUIRED): Verify the file, then close the Playwright browser

After `download.saveAs(path)` resolves, confirm the file actually landed before tearing down the browser. Then close Playwright so the Edge window does not stay open.

```powershell
$f = Get-Item -LiteralPath $path -ErrorAction Stop
if ($f.Length -lt 1000) { throw "Downloaded file is suspiciously small ($($f.Length) bytes) - keeping browser open for inspection" }
```

Only if the file exists AND is a plausible size (at least 1,000 bytes, matching the check above), call `playwright-browser_close` to shut down the browser. If verification fails, leave the browser open and surface the error to the user. Do NOT close it on failure, because the user may need to retry from the current page state.

```
playwright-browser_close   # one tool call, no args
```

This is required for every invocation of the skill, including automated/background runs (e.g. when called from `/transcript-watcher`). A leftover Edge window using the Playwright user-data-dir profile will block the next Playwright session and require manual cleanup.
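
When the size check runs in the same Node context as `download.saveAs(path)` instead of PowerShell, it could be mirrored as below. This is a sketch only: `verifyTranscriptFile` is a hypothetical name, and the 1,000-byte threshold simply copies the PowerShell check above.

```js
const fs = require('node:fs');

// Hypothetical Node-side mirror of the PowerShell check. statSync throws if the
// file is missing; a file smaller than minBytes is rejected explicitly.
function verifyTranscriptFile(filePath, minBytes = 1000) {
  const { size } = fs.statSync(filePath);
  if (size < minBytes) {
    throw new Error(`Downloaded file is suspiciously small (${size} bytes)`);
  }
  return size; // handy for the file-size line in the final report
}
```

As with the PowerShell version, an exception here means the browser should stay open for inspection.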

### Fallback time source

If Step 3 finds no "Meeting started at" line at all:
1. Look for any "Meeting started:" text inside the most recent meeting *card* (it usually shows the start time too).
2. Ask the user via `m_ask_user`: "What time did the meeting actually start (local time)?"
3. **NEVER** use the Recap picker's scheduled time. **NEVER** use UTC time from the .docx metadata.

## Pitfalls

- **TIMESTAMP SOURCE:** The Recap picker shows scheduled occurrence time, not recording start time. Always pull the time from the chat thread's "Meeting started at ..." line.
- **Short `/meet/<id>?p=...` URLs** cannot be resolved via the Graph `JoinWebUrl` filter. Use the meeting subject or chat link instead.
- **Navigating to `/meet/<id>` from a fresh browser context** lands on the anonymous join launcher. Always navigate via `/l/chat/...`.
- **The Recap-tab occurrence picker** can lag by 15+ minutes after a recording ends. The chat-thread Transcript button is faster and lets you grab the recording start time before navigating away.
- **DON'T scrape the transcript DOM** for anything longer than ~3 minutes. Virtualization will silently truncate. Always use the native Download button.
- **The Download submenu lives inside the xplatplugins.aspx iframe.** Playwright's `getByRole` reaches into it. Raw `evaluate` + manual focus/keyboard fails.
- **Wire `page.waitForEvent('download')` BEFORE the menu-item click**, using `Promise.all`. Setting it up after misses the event.
- **OneDrive does NOT auto-export the transcript .docx.** Only the `.mp4` recording auto-saves. This skill IS the export.
- **The chat may contain multiple meeting occurrences.** Grab the LAST `Transcript` button AND the visually-lowest "Meeting started at ..." line — they correspond.

## Privacy
Transcripts often contain confidential meeting content. Save locally only. Never auto-forward, summarize externally, or send to chat without explicit user OK.

## Output
Report to user: meeting subject, **recording start date + time (local)**, file path, file size, and offer to extract text or summarize. If the recap picker's scheduled time differed from the recording start time, mention that too so the user knows the filename reflects the actual start. Mention that the Playwright browser has been closed (or, on failure, that it was left open for inspection).

## Method A (Graph API, currently unavailable in most tenants)
`GET /me/onlineMeetings/{id}/transcripts/{tid}/content` would be cleanest but requires `OnlineMeetings.Read` + `OnlineMeetingTranscript.Read.All` (admin consent required). If your tenant grants these scopes to your Clawpilot token, prefer Graph; otherwise use the Playwright flow above.
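
For tenants where those scopes are granted, the request could be sketched as follows. Everything beyond the endpoint path is an assumption to verify against the Graph documentation: the `v1.0` base URL, the `$format=text/vtt` query parameter, and the way the bearer token is obtained. `transcriptContentUrl` and `fetchTranscriptVtt` are hypothetical names.

```js
// Hypothetical helper: build the transcript content URL from the IDs.
function transcriptContentUrl(meetingId, transcriptId) {
  const base = 'https://graph.microsoft.com/v1.0'; // assumed API version
  return `${base}/me/onlineMeetings/${encodeURIComponent(meetingId)}` +
    `/transcripts/${encodeURIComponent(transcriptId)}/content?$format=text/vtt`;
}

// Fetch the transcript as VTT text. `token` is assumed to be a valid bearer
// token carrying the scopes named above; uses the global fetch (Node 18+).
async function fetchTranscriptVtt(token, meetingId, transcriptId) {
  const res = await fetch(transcriptContentUrl(meetingId, transcriptId), {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Graph returned ${res.status}`);
  return res.text();
}
```

A 403 from this call is the usual sign that admin consent is missing, in which case the Playwright flow remains the path to use.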