Troubleshooting
Common Issues
AccessDeniedException on Bedrock
Symptom: Lambda throws AccessDeniedException and the resume lands in the DLQ. CloudWatch
Logs shows:
AccessDeniedException: User is not authorized to perform bedrock:InvokeModel
Causes and fixes:
| Cause | Fix |
|---|---|
| Bedrock model access not enabled | Open the Bedrock console → Model access → enable Claude Sonnet 4.6 in the deployment region |
| Stack deployed to a region where global.anthropic.claude-sonnet-4-6 is not supported | Redeploy to a supported region (e.g. us-east-1, eu-west-1) |
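The region cause in the table above can be checked from a script by listing the inference profiles visible in the deployment region. This is a minimal sketch, assuming boto3 is installed and that your SDK version exposes `list_inference_profiles` on the `bedrock` client. `profile_available` is a hypothetical helper name, and the client is passed in so the function can be tried against a stub. It does not tell you whether model access has been enabled in the console.

```python
def profile_available(bedrock_client, profile_id):
    """Return True if profile_id is among the region's inference profiles."""
    kwargs = {}
    while True:
        page = bedrock_client.list_inference_profiles(**kwargs)
        for summary in page.get("inferenceProfileSummaries", []):
            if summary.get("inferenceProfileId") == profile_id:
                return True
        token = page.get("nextToken")
        if not token:
            # No more pages and no match: the profile is not listed here.
            return False
        kwargs = {"nextToken": token}
```

A call might look like `profile_available(boto3.client("bedrock", region_name="us-east-1"), "global.anthropic.claude-sonnet-4-6")`, where the region is whichever one the stack is deployed to.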
Resume Lands in the Dead Letter Queue
Symptom: CVIngestionDLQAlarm fires. The CVIngestionDLQDepth metric is ≥ 1.
Steps:
- Open CloudWatch → Log groups → /aws/lambda/ProcessCVFunction
- Find the log stream for the failed invocation and identify the root cause
- Fix the root cause (see other entries in this guide for specific errors)
- Redrive the failed message:
  - Open the SQS Console
  - Select CVIngestionDLQ
  - Dead-letter queue actions → Start DLQ redrive
  - Redrive destination: CVIngestionQueue
  - Click Redrive
Lambda picks up the message automatically and reprocesses the resume.
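The console redrive can also be started from a script. The sketch below uses the SQS `start_message_move_task` API through boto3. It assumes the physical queue names are exactly `CVIngestionDLQ` and `CVIngestionQueue`; if your deployment adds a generated suffix, pass the real names instead. `start_redrive` is a hypothetical helper name.

```python
def start_redrive(sqs_client, dlq_name="CVIngestionDLQ", target_name="CVIngestionQueue"):
    """Start moving messages from the DLQ back to the main queue.

    Returns the task handle reported by SQS for the move task.
    """
    def queue_arn(name):
        # Resolve queue name -> URL -> ARN, since the move task needs ARNs.
        url = sqs_client.get_queue_url(QueueName=name)["QueueUrl"]
        attrs = sqs_client.get_queue_attributes(
            QueueUrl=url, AttributeNames=["QueueArn"]
        )
        return attrs["Attributes"]["QueueArn"]

    response = sqs_client.start_message_move_task(
        SourceArn=queue_arn(dlq_name),
        DestinationArn=queue_arn(target_name),
    )
    return response["TaskHandle"]
```

With real credentials this would be called as `start_redrive(boto3.client("sqs"))`. Run it only after the root cause is fixed, otherwise the messages are likely to fail again and return to the DLQ.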
NoSuchKey — Resume Not Found in S3
Symptom: Lambda logs:
Error retrieving document from S3 [bucket=my-bucket, key=resumes/john-doe.pdf]: NoSuchKey
Cause: The resume was deleted from S3 between the upload event and Lambda processing. This is uncommon but can happen if a lifecycle policy or manual deletion removes the object before the queue is drained.
Fix: Re-upload the resume. The pipeline will process it automatically.
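Before re-uploading, it can help to confirm that the object is really gone. The sketch below pulls the bucket and key out of a log line in the format shown above and checks for the object with `list_objects_v2`. It assumes boto3 and a key that contains no `]` character. Both function names are hypothetical.

```python
import re

# Matches the "[bucket=..., key=...]" fragment of the log line.
_LOCATION = re.compile(r"\[bucket=(?P<bucket>[^,\]]+), key=(?P<key>[^\]]+)\]")


def parse_s3_location(log_line):
    """Return (bucket, key) from a NoSuchKey log line, or None if absent."""
    match = _LOCATION.search(log_line)
    return (match.group("bucket"), match.group("key")) if match else None


def object_exists(s3_client, bucket, key):
    """Return True if an object with exactly this key is listed in the bucket."""
    page = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj["Key"] == key for obj in page.get("Contents", []))
```

For example, `parse_s3_location(line)` on the log line above yields `("my-bucket", "resumes/john-doe.pdf")`, which can be passed straight to `object_exists(boto3.client("s3"), bucket, key)`.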
Lambda Duration Alarm Fires
Symptom: ProcessResumeDurationAlarm fires. Lambda duration p95 ≥ 4 minutes.
Cause: Bedrock latency spike. Claude Sonnet 4.6 inference is typically fast, but occasional latency spikes occur during high AWS load periods.
Fix: No immediate action is required. If the Lambda times out, the message becomes visible again once its visibility timeout expires and SQS redelivers it. Check the AWS Service Health Dashboard to confirm Bedrock is operating normally.
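To see how long a spike lasted, the p95 duration can be read back from CloudWatch. This is a sketch only: it assumes boto3, takes the function name `ProcessCVFunction` from the log group name used in this guide, and picks a 5-minute period arbitrarily. `slow_periods` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

THRESHOLD_MS = 4 * 60 * 1000  # the 4-minute alarm threshold, in milliseconds


def slow_periods(cloudwatch_client, function_name="ProcessCVFunction", hours=3, now=None):
    """Return (timestamp, p95_ms) pairs at or above the threshold, oldest first."""
    end = now or datetime.now(timezone.utc)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="Duration",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=300,  # assumed 5-minute buckets
        ExtendedStatistics=["p95"],
    )
    points = [
        (p["Timestamp"], p["ExtendedStatistics"]["p95"])
        for p in response.get("Datapoints", [])
        if p["ExtendedStatistics"]["p95"] >= THRESHOLD_MS
    ]
    return sorted(points)
```

An empty result from `slow_periods(boto3.client("cloudwatch"))` suggests the spike has already passed.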
Queue Age Alarm Fires
Symptom: CVIngestionQueueAgeAlarm fires. Messages are waiting in CVIngestionQueue
for longer than 10 minutes.
Causes and fixes:
| Cause | Fix |
|---|---|
| Lambda throttled (concurrency limit reached) | Check Lambda concurrency metrics in CloudWatch; request a quota increase if needed |
| Lambda in a persistent error loop | Check CloudWatch Logs for repeated errors; fix root cause |
| Burst of uploads exceeds processing capacity | Expected — SQS will drain automatically as Lambda catches up |
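One way to tell the first two causes apart is to compare the Lambda `Throttles` and `Errors` metrics over the last hour. The sketch below assumes boto3 and the function name `ProcessCVFunction` (taken from the log group name). `lambda_health` is a hypothetical helper name, and the 5-minute period is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone


def lambda_health(cloudwatch_client, function_name="ProcessCVFunction", minutes=60, now=None):
    """Return total Throttles and Errors for the function over the window."""
    end = now or datetime.now(timezone.utc)
    totals = {}
    for metric in ("Throttles", "Errors"):
        response = cloudwatch_client.get_metric_statistics(
            Namespace="AWS/Lambda",
            MetricName=metric,
            Dimensions=[{"Name": "FunctionName", "Value": function_name}],
            StartTime=end - timedelta(minutes=minutes),
            EndTime=end,
            Period=300,  # assumed 5-minute buckets
            Statistics=["Sum"],
        )
        totals[metric] = sum(p["Sum"] for p in response.get("Datapoints", []))
    return totals
```

A high `Throttles` total points at the concurrency limit, a high `Errors` total points at an error loop, and both near zero is consistent with a plain upload burst.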
SNS Subscription Not Confirmed — No Alert Emails
Symptom: CloudWatch alarms fire but no email is received.
Cause: The SNS subscription requires email confirmation. AWS sends a confirmation email
to the alertEmail address on first deploy — if it was not confirmed, alerts are not
delivered.
Fix:
- Open the SNS console → Topics → CVIngestionAlertTopic
- Check Subscriptions. If the status is PendingConfirmation, resend the confirmation
- Check your spam folder for the original confirmation email from no-reply@sns.amazonaws.com
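The subscription check can also be scripted. The sketch below lists the subscriptions on a topic and returns those still awaiting confirmation, which SNS reports with the literal `SubscriptionArn` value `PendingConfirmation`. It assumes boto3 and that you know the topic ARN. `pending_subscriptions` is a hypothetical helper name.

```python
def pending_subscriptions(sns_client, topic_arn):
    """Return (protocol, endpoint) pairs that have not confirmed yet."""
    pending = []
    kwargs = {"TopicArn": topic_arn}
    while True:
        page = sns_client.list_subscriptions_by_topic(**kwargs)
        for sub in page.get("Subscriptions", []):
            if sub.get("SubscriptionArn") == "PendingConfirmation":
                pending.append((sub.get("Protocol"), sub.get("Endpoint")))
        token = page.get("NextToken")
        if not token:
            return pending
        kwargs["NextToken"] = token
```

For each pending email endpoint, calling `sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=address)` again should trigger a fresh confirmation message.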
Unsupported Document Format
Symptom: Lambda logs a format error or Bedrock returns an unexpected response for a specific file.
Supported formats: PDF, DOCX, DOC. The format is detected from the file extension.
Fix: Ensure the uploaded file has the correct extension (.pdf, .docx, or .doc)
and that the file is not corrupted. Re-upload with the correct extension if needed.
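A quick local check before re-uploading is to compare the extension with the file's leading bytes. The sketch below is not the pipeline's own detection logic, which according to this guide looks only at the extension. It relies on the standard signatures for each format (`%PDF-` for PDF, the ZIP header for DOCX, the OLE compound file header for DOC) and assumes the signature sits at byte 0. `check_document` is a hypothetical helper name.

```python
import os

# Leading bytes expected for each supported extension.
MAGIC = {
    ".pdf": b"%PDF-",
    ".docx": b"PK\x03\x04",  # DOCX is a ZIP container
    ".doc": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",  # OLE compound file
}


def check_document(filename, head):
    """Classify a file from its name and its first few bytes."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in MAGIC:
        return "unsupported extension"
    if not head.startswith(MAGIC[ext]):
        return "content does not match extension"
    return "ok"
```

Reading the first 8 bytes of the file is enough, for example `check_document(path, open(path, "rb").read(8))`. A mismatch usually means the file was renamed without being converted, or is truncated.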
Useful CloudWatch Logs Insights Queries
Open CloudWatch → Logs Insights, select the log group
/aws/lambda/ProcessCVFunction, and run these queries.
Find all errors in the last hour:
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 50
Trace a specific resume by filename:
fields @timestamp, @message
| filter @message like /your-resume\.pdf/
| sort @timestamp asc
Find slow invocations (duration > 3 minutes):
fields @timestamp, @duration, @requestId
| filter @duration > 180000
| sort @duration desc
| limit 20
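The same queries can be run from a script through the CloudWatch Logs `start_query` and `get_query_results` APIs. This is a minimal sketch assuming boto3. `run_insights_query` and its polling parameters are hypothetical, and the time range is given in epoch seconds.

```python
import time


def run_insights_query(logs_client, query, start_epoch, end_epoch,
                       log_group="/aws/lambda/ProcessCVFunction",
                       poll_seconds=1.0, max_polls=60):
    """Run a Logs Insights query and return rows as {field: value} dicts."""
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_epoch,
        endTime=end_epoch,
        queryString=query,
    )["queryId"]
    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} cells.
            return [{c["field"]: c["value"] for c in row} for row in response["results"]]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("query did not complete in time")
```

For the slow-invocation query, `query` would be the three pipeline lines above joined with newlines, and the client would be `boto3.client("logs")`.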