The text_to_docs function splits a file into multiple documents.
text_to_docs(doc, sectionName, filename)
| Argument | Description | 
|---|---|
| doc | (LuaDocument) The document that you want to divide into multiple documents. | 
| sectionName | (string) The name of the section in the CFS configuration file that contains the TextToDocs configuration parameters. For information about these parameters, see TextToDocs Task Parameters. | 
| filename | (string) The file that contains the text to be converted (the original file that resulted in the document). | 
Returns: (LuaDocuments) A list of document objects representing the documents that are produced.
You might have a connector ingesting files from a repository, but want to split those files into multiple documents. The following example uses the get_filename function to find the path of the file associated with an ingested document, and uses the text_to_docs function to generate multiple documents. This example splits the file using settings in the [MyTextToDocs] section of the CFS configuration file. It then calls the ingest function to add the resulting documents to the ingest queue. 
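function handler(document)
   -- Skip documents that this handler has already marked as processed.
   if document:hasField("PROCESSED") then
      return true
   end

   -- Find the path of the file associated with the ingested document.
   local file = get_filename(document)
   -- Split the file using the settings in the [MyTextToDocs] section.
   local docs = text_to_docs(document, "MyTextToDocs", file)

   -- Mark each generated document and add it to the ingest queue.
   for _, doc in ipairs(docs) do
      doc:addField("PROCESSED", "YES")
      ingest(doc)
   end

   return true
end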
In this example, the original documents are also indexed. If you want to index only the documents generated by the text_to_docs function, you could return false from the handler function.
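The variant that returns false can be sketched as follows. This is a minimal sketch, not a definitive implementation: new_stub_doc, stub_ingested, and the fallback definitions of get_filename, text_to_docs, and ingest are hypothetical stand-ins, added only so that the script can be run outside CFS. Inside CFS the real functions are already defined and the fallbacks are not used.

```lua
-- Hypothetical stand-ins (assumptions) so this sketch runs outside CFS.
-- CFS supplies the real get_filename, text_to_docs, and ingest functions.
stub_ingested = {}

function new_stub_doc()
   local doc = { fields = {} }
   function doc:hasField(name) return self.fields[name] ~= nil end
   function doc:addField(name, value) self.fields[name] = value end
   return doc
end

get_filename = get_filename or function(document)
   return "example.txt"
end

text_to_docs = text_to_docs or function(document, sectionName, filename)
   -- Pretend the file was split into two documents.
   return { new_stub_doc(), new_stub_doc() }
end

ingest = ingest or function(doc)
   table.insert(stub_ingested, doc)
end

function handler(document)
   -- Documents already marked as processed are kept and indexed.
   if document:hasField("PROCESSED") then
      return true
   end

   local file = get_filename(document)
   local docs = text_to_docs(document, "MyTextToDocs", file)

   for _, doc in ipairs(docs) do
      doc:addField("PROCESSED", "YES")
      ingest(doc)
   end

   -- Return false for the original document so that only the
   -- generated documents are indexed.
   return false
end
```

The only change from the example above is the final return value. The check for the PROCESSED field still returns true, so the generated documents are kept when the handler is called for them.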