KB Source Resolution

Technical specification for the Knowledge Base (KB) source resolution algorithm in pair-cli.

Resolution Algorithm

Precedence Order

The KB source is resolved in the following order (highest to lowest priority):

  1. CLI flag --source <url|path> — Explicit user-provided source
  2. Monorepo default: packages/knowledge-hub/dataset (when running from the monorepo)
  3. GitHub release auto-download — Latest release from configured repository

Decision Tree

                  KB Source Resolution
                         |
                 --source provided?
                /                  \
              YES                   NO
               |                     |
          --offline?            In monorepo?
          /        \            /          \
        YES        NO        YES          NO
         |          |          |            |
    Validate    Parse      Use monorepo   Download
    is local    source     dataset        from GitHub
    path        type

Source Type Detection

URL Detection

function isUrl(source: string): boolean {
  return source.startsWith('http://') || source.startsWith('https://')
}

Local Path Detection

function isLocalPath(source: string): boolean {
  return (
    source.startsWith('/') ||          // Absolute Unix path
    source.startsWith('./') ||         // Relative path
    source.startsWith('../') ||        // Relative parent path
    /^[A-Za-z]:[/\\]/.test(source)    // Windows absolute path
  )
}
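
The two predicates above can be exercised together to see which inputs are accepted. The following is a minimal sketch: `classifySource` is a hypothetical helper (not part of pair-cli) that applies the predicates in the order the resolver uses, and the sample inputs are illustrative only.

```typescript
// Copies of the two predicates from this section, so the example runs on its own.
function isUrl(source: string): boolean {
  return source.startsWith('http://') || source.startsWith('https://')
}

function isLocalPath(source: string): boolean {
  return (
    source.startsWith('/') ||
    source.startsWith('./') ||
    source.startsWith('../') ||
    /^[A-Za-z]:[/\\]/.test(source)
  )
}

type SourceKind = 'url' | 'local' | 'invalid'

// Hypothetical helper: URL check first, then local path, otherwise invalid.
function classifySource(source: string): SourceKind {
  if (isUrl(source)) return 'url'
  if (isLocalPath(source)) return 'local'
  return 'invalid'
}

// Illustrative inputs only.
const samples: string[] = [
  'https://example.com/kb.zip',
  './kb',
  '../shared/kb',
  '/opt/kb',
  'C:\\kb\\dataset',
  'kb/dataset', // bare relative path: matches neither predicate
  'HTTPS://example.com/kb.zip', // the scheme check is case-sensitive
]

for (const sample of samples) {
  console.log(`${sample} -> ${classifySource(sample)}`)
}
```

As written, the predicates reject a bare relative path such as `kb/dataset` and an upper-case scheme, so callers would need to pass `./kb/dataset` or a lower-case URL.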

Monorepo Detection

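import * as fs from 'node:fs'
import * as path from 'node:path'

// True when the monorepo dataset directory exists under the current working directory.
function isMonorepo(): boolean {
  const monorepoKbPath = path.resolve(process.cwd(), 'packages/knowledge-hub/dataset')
  return fs.existsSync(monorepoKbPath)
}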

Option Validation

Constraint: --offline + --source Combination

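| --source value | --offline | Valid? | Action                                   |
|----------------|-----------|--------|------------------------------------------|
| Not provided   | No        | Yes    | Auto-download or use monorepo            |
| Not provided   | Yes       | No     | ERROR: --offline requires --source       |
| HTTP/HTTPS URL | No        | Yes    | Download from URL                        |
| HTTP/HTTPS URL | Yes       | No     | ERROR: Cannot combine URL with --offline |
| Local path     | No        | Yes    | Use local source                         |
| Local path     | Yes       | Yes    | Use local source (offline mode)          |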
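
The six combinations above could be checked up front, before any resolution work starts. The sketch below is one possible shape, not the pair-cli implementation: `validateOptions` and the `Validation` type are hypothetical names, and any non-URL source is treated as a local path here (format and existence checks are assumed to happen later).

```typescript
type Validation =
  | { valid: true; action: string }
  | { valid: false; error: string }

function isUrl(source: string): boolean {
  return source.startsWith('http://') || source.startsWith('https://')
}

// Hypothetical validator mirroring the --offline / --source combinations.
function validateOptions(options: { source?: string; offline?: boolean }): Validation {
  const offline = options.offline === true

  // No --source: offline mode has nothing to read from.
  if (!options.source) {
    return offline
      ? { valid: false, error: '--offline requires --source' }
      : { valid: true, action: 'Auto-download or use monorepo' }
  }

  // Remote URL: incompatible with offline mode.
  if (isUrl(options.source)) {
    return offline
      ? { valid: false, error: 'Cannot combine URL with --offline' }
      : { valid: true, action: 'Download from URL' }
  }

  // Assumption: everything else is a local path, valid in both modes.
  return {
    valid: true,
    action: offline ? 'Use local source (offline mode)' : 'Use local source',
  }
}

console.log(validateOptions({ source: 'https://example.com/kb.zip', offline: true }))
```

Returning a result object rather than throwing keeps the check easy to test; a caller can map the `error` string to the matching entry in the error catalog.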

Resolution Implementation

interface KbSource {
  type: 'url' | 'local' | 'monorepo' | 'github-release'
  location: string
  requiresDownload: boolean
}
 
async function resolveKbSource(options: {
  source?: string
  offline?: boolean
}): Promise<KbSource> {
  // Precedence 1: Explicit --source flag
  if (options.source) {
    if (isUrl(options.source)) {
      if (options.offline) {
        throw new Error('Cannot use --offline with remote URL')
      }
      return { type: 'url', location: options.source, requiresDownload: true }
    } else if (isLocalPath(options.source)) {
      const resolvedPath = path.resolve(options.source)
      if (!fs.existsSync(resolvedPath)) {
        throw new Error(`KB source path does not exist: ${resolvedPath}`)
      }
      return { type: 'local', location: resolvedPath, requiresDownload: false }
    } else {
      throw new Error(`Invalid source format: ${options.source}`)
    }
  }
 
  // Precedence 2: Monorepo default
  if (isMonorepo()) {
    return {
      type: 'monorepo',
      location: path.resolve('packages/knowledge-hub/dataset'),
      requiresDownload: false,
    }
  }
 
  // Precedence 3: GitHub release auto-download
  if (options.offline) {
    throw new Error('Cannot auto-download KB in offline mode')
  }
  return {
    type: 'github-release',
    location: await getLatestReleaseUrl(),
    requiresDownload: true,
  }
}

Error Catalog

| Code | Error                      | Exit | Cause                           | Resolution           |
|------|----------------------------|------|---------------------------------|----------------------|
| E001 | Offline without source     | 1    | --offline without --source      | Provide --source     |
| E002 | Offline with remote URL    | 1    | --offline with HTTP URL         | Use local path       |
| E003 | Source path not found      | 1    | --source points to missing path | Verify path          |
| E004 | Invalid source format      | 1    | Source is neither URL nor path  | Use valid format     |
| E005 | Download network error     | 2    | Network failure                 | Check connectivity   |
| E006 | Checksum validation failed | 2    | SHA256 mismatch                 | Retry download       |
| E007 | Auto-download in offline   | 1    | No source + offline mode        | Provide local source |
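
One way to keep codes, messages, and exit codes in sync is to hold the catalog as typed data and raise a single error class. This is a sketch under assumptions: `KbError`, `KB_ERRORS`, and `exitCodeFor` are hypothetical names, and the fallback exit code of 1 for unrecognized errors is an assumption, not something this specification states.

```typescript
type KbErrorCode = 'E001' | 'E002' | 'E003' | 'E004' | 'E005' | 'E006' | 'E007'

// Messages and exit codes copied from the error catalog.
const KB_ERRORS: Record<KbErrorCode, { message: string; exitCode: number }> = {
  E001: { message: 'Offline without source', exitCode: 1 },
  E002: { message: 'Offline with remote URL', exitCode: 1 },
  E003: { message: 'Source path not found', exitCode: 1 },
  E004: { message: 'Invalid source format', exitCode: 1 },
  E005: { message: 'Download network error', exitCode: 2 },
  E006: { message: 'Checksum validation failed', exitCode: 2 },
  E007: { message: 'Auto-download in offline', exitCode: 1 },
}

function formatMessage(code: KbErrorCode, detail?: string): string {
  const base = `${code}: ${KB_ERRORS[code].message}`
  return detail ? `${base} (${detail})` : base
}

// Hypothetical error class carrying the catalog code and exit code.
class KbError extends Error {
  readonly code: KbErrorCode
  readonly exitCode: number

  constructor(code: KbErrorCode, detail?: string) {
    super(formatMessage(code, detail))
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype)
    this.name = 'KbError'
    this.code = code
    this.exitCode = KB_ERRORS[code].exitCode
  }
}

// Assumption: errors outside the catalog exit with code 1.
function exitCodeFor(err: unknown): number {
  return err instanceof KbError ? err.exitCode : 1
}

try {
  throw new KbError('E003', './missing-kb')
} catch (err) {
  console.error((err as Error).message, '-> exit', exitCodeFor(err))
}
```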

Download Process

When a resolved source has requiresDownload: true (URL sources), the download proceeds in these steps:

  1. HEAD request — Fetch URL metadata
  2. Fetch checksum — Download .sha256 checksum file
  3. Download KB — GET request with progress tracking
  4. Validate SHA256 — Verify integrity (if checksum available)
  5. Extract to cache — Store at ~/.pair/kb/{version}/
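
Steps 2 and 4 can be sketched without any network access. The example below is illustrative only: this document does not specify the layout of the .sha256 file, so the sketch assumes the common `sha256sum` form (a 64-character hex digest, optionally followed by a file name). The names `parseChecksumFile` and `verifyChecksum` are hypothetical, and treating an unparseable checksum file as a mismatch is an assumption.

```typescript
import { createHash } from 'node:crypto'

type ChecksumResult = 'ok' | 'skipped' | 'mismatch'

// Assumed layout: "<64 hex chars>" or "<64 hex chars>  <filename>".
function parseChecksumFile(body: string): string | null {
  const first = body.trim().split(/\s+/)[0] ?? ''
  return /^[0-9a-fA-F]{64}$/.test(first) ? first.toLowerCase() : null
}

function sha256Hex(data: Uint8Array): string {
  return createHash('sha256').update(data).digest('hex')
}

// checksumBody is null when no .sha256 file could be fetched (step 4 is then skipped).
function verifyChecksum(data: Uint8Array, checksumBody: string | null): ChecksumResult {
  if (checksumBody === null) return 'skipped'
  const expected = parseChecksumFile(checksumBody)
  if (expected === null) return 'mismatch' // assumption: malformed checksum counts as failure
  return sha256Hex(data) === expected ? 'ok' : 'mismatch'
}

const payload = new TextEncoder().encode('abc')
const checksumFile =
  'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad  kb.zip\n'
console.log(verifyChecksum(payload, checksumFile))
```

A 'mismatch' result corresponds to E006 in the error catalog; 'skipped' reflects the "if checksum available" condition in step 4.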

Cache Strategy

Cache Location: ~/.pair/kb/{version}/

Downloaded KBs are cached to avoid redundant downloads. The cache key is derived from the download URL or the version string.
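
How the key is derived is not spelled out here, so the following is only a sketch of one plausible scheme: use the version string directly when it is known, and otherwise fall back to a short SHA-256 hash of the download URL. The function names, the 16-character truncation, and the hashing choice are all assumptions.

```typescript
import { createHash } from 'node:crypto'
import * as os from 'node:os'
import * as path from 'node:path'

// Default cache root from this section: ~/.pair/kb
function defaultCacheRoot(): string {
  return path.join(os.homedir(), '.pair', 'kb')
}

// Hypothetical key derivation: prefer the version, else hash the URL.
function cacheKey(input: { version?: string; url?: string }): string {
  if (input.version) return input.version
  if (input.url) {
    return createHash('sha256').update(input.url).digest('hex').slice(0, 16)
  }
  throw new Error('cacheKey requires a version or a URL')
}

// Directory where a given KB would be extracted.
function cacheDirFor(
  input: { version?: string; url?: string },
  root: string = defaultCacheRoot(),
): string {
  return path.join(root, cacheKey(input))
}

console.log(cacheDirFor({ version: '1.2.3' }))
console.log(cacheDirFor({ url: 'https://example.com/kb.zip' }))
```

Hashing the URL keeps the directory name free of characters that are awkward in file paths, at the cost of a key that is not human-readable.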

Configuration

Environment Variables

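| Variable            | Description                         | Default               |
|---------------------|-------------------------------------|-----------------------|
| PAIR_KB_CACHE_DIR   | Override KB cache directory         | ~/.pair/kb            |
| PAIR_KB_DEFAULT_URL | Override default GitHub release URL | GitHub latest release |
| PAIR_DIAG           | Enable diagnostic logging           | 0 (disabled)          |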
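
A small reader for these variables might look like the sketch below. `readKbEnv` and `KbEnvConfig` are hypothetical names. The specification only gives 0 as the disabled default for PAIR_DIAG, so treating any other non-empty value as "enabled" is an assumption, as is treating an empty string the same as an unset variable.

```typescript
import * as os from 'node:os'
import * as path from 'node:path'

interface KbEnvConfig {
  cacheDir: string
  defaultUrl: string | undefined // undefined means: use the latest GitHub release
  diagnostics: boolean
}

// Hypothetical reader; accepts an env map so it can be tested without touching process.env.
function readKbEnv(env: Record<string, string | undefined> = process.env): KbEnvConfig {
  const diag = env.PAIR_DIAG ?? '0'
  return {
    cacheDir: env.PAIR_KB_CACHE_DIR || path.join(os.homedir(), '.pair', 'kb'),
    defaultUrl: env.PAIR_KB_DEFAULT_URL || undefined,
    // Assumption: anything other than '' or '0' turns diagnostics on.
    diagnostics: diag !== '' && diag !== '0',
  }
}

console.log(readKbEnv({ PAIR_KB_CACHE_DIR: '/tmp/pair-kb', PAIR_DIAG: '1' }))
```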