Harvest Error -- Keyword Format
Overview
How to fix harvest validation errors where the keyword field is present but contains a string instead of the required array of strings.
Harvester Guide Pages
| Error Type | Page |
|---|---|
| Getting Started | What is Harvesting?; Understanding Harvest Errors |
| Quick Lookup | FAQ Overview; Quick Reference |
| Date & Time | Date Format Errors (modified, issued) |
| Update Frequency | accrualPeriodicity Errors |
| License | License Field Errors |
| Contact Info | Email Format Errors (contactPoint.hasEmail) |
| Keywords/Tags | Missing Keywords; Keyword Format |
| Missing Fields | Missing Required Fields (modified, keyword, description) |
| File Structure | Transformation Errors (ISO 19115, XML, file problems) |
| Other Issues | Duplicates, Sync Failures, Unrecognized Records |
keyword – wrong format errors
This error affects about 26 records. It occurs when the keyword field exists in the record but its value is a plain string instead of an array of strings. This is different from a missing keyword field. The field is there – it just has the wrong data type.
What you see
$.keyword does not match any of the acceptable formats
What this means
The DCAT-US schema requires keyword to be an array – a list – even if there is only one keyword. A plain string is not accepted.
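As a rough illustration of the rule above, the following Python sketch checks whether a value would count as "an array of strings". The function name `is_valid_keyword` is hypothetical, and the check covers only the data-type rule described on this page; the actual schema may impose further constraints that are not modeled here.

```python
def is_valid_keyword(value):
    """Return True if value is a list whose elements are all strings.

    Only the data-type rule from this page is checked; any additional
    schema constraints are outside the scope of this sketch.
    """
    return isinstance(value, list) and all(isinstance(k, str) for k in value)


print(is_valid_keyword(["environment"]))   # True: a single keyword in an array
print(is_valid_keyword("environment"))     # False: plain string
print(is_valid_keyword(None))              # False: null value
print(is_valid_keyword([1, 2, "health"]))  # False: non-string elements
```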
Common problems and fixes
Single string instead of an array:
- Wrong: "keyword": "environment"
- Correct: "keyword": ["environment"]
Null value:
- Wrong: "keyword": null
- Correct: "keyword": ["unspecified"] or add meaningful keywords
Non-string elements in the array:
- Wrong: "keyword": [1, 2, "health"]
- Correct: "keyword": ["health"]
What the correct format looks like
Keywords must always be an array of strings, wrapped in square brackets:
"keyword": ["public health", "surveillance", "CDC"]
Even a single keyword must be in an array:
"keyword": ["environment"]
If you can edit your metadata directly
Wrap the keyword value in square brackets to make it an array. If the field contains null or non-string values, replace them with descriptive text keywords.
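If you maintain the metadata with a script, the same fixes could be sketched in Python as shown here. This is a minimal sketch, not an official tool: `normalize_keyword` and `PLACEHOLDER` are hypothetical names, and falling back to "unspecified" when nothing usable remains is an assumption based on the null-value example on this page. Descriptive keywords chosen by a person are always preferable to the placeholder.

```python
# Placeholder taken from the null-value example on this page (assumption:
# it is acceptable as a last resort; real keywords are better).
PLACEHOLDER = "unspecified"


def normalize_keyword(value):
    """Return the keyword value as a list of strings."""
    if isinstance(value, str):
        # Plain string: wrap it in a list. A blank string carries no
        # information, so it falls back to the placeholder (assumption).
        return [value] if value.strip() else [PLACEHOLDER]
    if isinstance(value, list):
        # Keep only the string elements; drop numbers, nulls, etc.
        strings = [k for k in value if isinstance(k, str)]
        return strings or [PLACEHOLDER]
    # null or any other type: nothing usable to keep.
    return [PLACEHOLDER]


record = {"title": "Air Quality", "keyword": "environment"}
record["keyword"] = normalize_keyword(record["keyword"])
print(record["keyword"])  # ['environment']
```

Records that end up with only the placeholder are worth reviewing by hand so that they can be given meaningful keywords.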
If you cannot edit the metadata yourself
Contact whoever manages your agency’s metadata publishing system and tell them:
“The keyword field on some of our datasets is formatted as a plain string instead of an array. It must always be an array of strings, even for a single keyword. For example, "environment" should be ["environment"]. This is causing validation failures on harvest.data.gov.”
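It can help to include a list of the affected datasets in that message. The Python sketch below shows one way to build such a list from a catalog that has already been loaded into a dictionary. It assumes the catalog keeps its records under a `dataset` key and that each record has a `title`; the function names are hypothetical, and the sample data is made up for illustration.

```python
def describe_keyword_problem(value):
    """Return a short description of the format problem, or None if none is found."""
    if isinstance(value, str):
        return "plain string instead of an array"
    if value is None:
        return "null value"
    if not isinstance(value, list):
        return f"unexpected type {type(value).__name__}"
    if not all(isinstance(k, str) for k in value):
        return "non-string elements in the array"
    return None


def report_keyword_problems(catalog):
    """Return (title, problem) pairs for records whose keyword has the wrong format."""
    report = []
    for record in catalog.get("dataset", []):
        if "keyword" not in record:
            continue  # a missing keyword field is a separate error
        problem = describe_keyword_problem(record["keyword"])
        if problem:
            report.append((record.get("title", "(no title)"), problem))
    return report


# Made-up sample catalog for illustration only.
sample = {
    "dataset": [
        {"title": "Air Quality", "keyword": "environment"},
        {"title": "Clinic Visits", "keyword": [1, 2, "health"]},
        {"title": "Water Use", "keyword": ["water"]},
    ]
}

for title, problem in report_keyword_problems(sample):
    print(f"{title}: {problem}")
```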