parse_document_csv
Parses a CSV file into documents and calls a function on each document.
This function can handle CSV files with or without a header row. If the file has no header row, you must:
- set the named parameter use_header_row to false.
- specify the document field names to use by setting the named parameter csv_field_names.
Syntax
parse_document_csv( filename, handler [, params ] )
Arguments
| Argument | Description |
|---|---|
| filename | (string) The path and file name of the CSV file to parse into documents. |
| handler | (document_handler_function) The function to call on each document that is parsed from the CSV file. |
| params | (table) A table of named parameters to configure parsing. The table maps parameter names (string) to parameter values. For information about the parameters that you can set, see the following table. |
Named Parameters
| Named Parameter | Description |
|---|---|
| content_field | (string, default DRECONTENT) The name of the field, in the CSV file, to use as the document content. |
| csv_field_names | (string list) A list of names for the fields that exist in the CSV file. This overrides any header row, if one is present. |
| reference_field | (string, default DREREFERENCE) The name of the field, in the CSV file, to use as the document reference. |
| use_header_row | (boolean, default true) Specifies whether the CSV file includes a header row (whether the first row is a list of field names and not values). If this parameter is true and you do not set csv_field_names, the field names in the header row are used as the names of the document fields. |
Example
The following example parses a CSV file named data.csv, and calls the function documentHandler on each document. The values in the field item_id become document references and the values in the field body become document content.
function documentHandler(document)
    -- do something, for example
    print(document:getReference())
end
...
parse_document_csv("./data.csv", documentHandler, {
    reference_field="item_id",
    content_field="body"
})
The following example shows how to provide field names when there is no header row in the CSV file:
parse_document_csv("./data_no_header.csv", documentHandler, {
    use_header_row=false,
    csv_field_names={"DREREFERENCE", "title", "modified", "DRECONTENT"}
})
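The description of csv_field_names states that it overrides any header row. The following sketch illustrates that case: the file keeps its header row, but the documents use the names supplied in csv_field_names. The helper name parse_with_custom_field_names is hypothetical, and the sketch assumes that the header row is still treated as a row of names and is not parsed as a document.

```lua
-- Hypothetical helper, not part of the product API.
local function parse_with_custom_field_names(filename, handler)
    parse_document_csv(filename, handler, {
        -- The first row of the file is a list of names, not values.
        use_header_row = true,
        -- Assumption based on the parameter description: these names
        -- replace the names that appear in the header row.
        csv_field_names = {"DREREFERENCE", "title", "modified", "DRECONTENT"}
    })
end
```

With the documentHandler function from the first example, a call might look like parse_with_custom_field_names("./data.csv", documentHandler).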
Returns
Nil.
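Because the function returns nil, anything you want to keep from the parsed documents has to be captured by the handler. The following is a minimal sketch of one way to do that with a Lua closure. The helper name collect_references is hypothetical, and the sketch assumes that every call to the handler completes before parse_document_csv returns.

```lua
-- Hypothetical helper, not part of the product API: gathers the reference
-- of every document parsed from a CSV file into a Lua table.
local function collect_references(filename, params)
    local references = {}

    -- The handler is a closure, so it can append to the table above.
    local function handler(document)
        references[#references + 1] = document:getReference()
    end

    -- parse_document_csv returns nil; results arrive only through the handler.
    parse_document_csv(filename, handler, params)

    return references
end
```

For example, collect_references("./data.csv", { reference_field="item_id", content_field="body" }) would return a table of the item_id values, under the assumption stated above.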