PDF to Excel Conversion

Hi!

I have a question and would appreciate your help.

I want to copy data from a PDF file into an Excel sheet. The data is laid out as a table in the PDF, and I want to store it in Excel.

My flow:

I read the PDF file, store its content in a variable, and pass that variable to the Excel sheet. This gives me the error “Invalid array format”. I then tried converting the PDF data in the variable to a string with .toString and passing that to the Excel sheet, but this gives me the error “Selectors: Error converting value “Hayyan Khan khaann Ali Khan khann Hamza anwar Malik Shahan Saeed khan” to type ‘System.Collections.Generic.IDictionary2[System.String,System.Collections.Generic.IDictionary2[System.String,System.String]]’. Path ‘’, line 1, position 80… If the issue persists, please refer to Help Center to make sure the activity ‘Save table’ was set up correctly.”

The data I have in the PDF:

I hope my scenario is clear. I look forward to your feedback.
NOTE: I don’t want to use any third-party app for this.

Thank you!

Best Regards,
Muhammad Hayyan Khan

@Muhammad_Hayyan The input you are passing to Excel might not be correct.
For example, if you are using the Write Excel File activity, the value must be in the format below:
{
  "My List": [
    {
      "Country": "Russia",
      "Capital": "Moscow"
    },
    {
      "Country": "USA",
      "Capital": "Washington DC"
    }
  ]
}

Please refer to this article: Write Excel File – ElectroNeek Help Center
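The error in the question seems to come from a plain string reaching an activity that expects structured data. As a rough sketch only, rows that have already been split into cells could be turned into the shape shown above like this. The `rows` data, the idea of treating the first row as the header, and the variable names are all assumptions for illustration, not something taken from the Help Center.

```javascript
// Hypothetical rows already split into cells; the first row is assumed to be the header.
var rows = [
  ["Country", "Capital"],
  ["Russia", "Moscow"],
  ["USA", "Washington DC"]
];

var header = rows[0];

// Turn every remaining row into an object keyed by the header names.
var records = rows.slice(1).map(function (cells) {
  var record = {};
  header.forEach(function (name, index) {
    record[name] = cells[index];
  });
  return record;
});

// Wrap the records under the same key used in the example above.
var payload = { "My List": records };

console.log(JSON.stringify(payload, null, 2));
```

Whether the activity accepts this object directly or needs it as text is something to check against the article.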


No, I am already following the same format. Thank you anyway! :blush:

@Muhammad_Hayyan You wrote “I am reading PDF file storing it in a variable”: which activity are you using to read the PDF? Could you send the value of this variable?

Also, you can use Intellidocs to extract table data from a PDF, which is the easiest way to do it. Please refer to the article below for more information about Intellidocs.


Yes, I have tried Intellidocs, but it would be quite an expensive solution.

I think I’ve found a solution. For this to work, the excel file must already exist.

PDF input:

Workflow:

Excel output:

Convert pdf_content into an array (which I called array) using split() to break the text into rows at each "\r\n", and then map() with split() again to break each row into cells at the spaces:

pdf_content.split("\r\n").map(function(row){return row.split(' ');});

Then add a counter i to control a loop that runs for the length of array, and use it to append each array item to the Excel sheet as a new row.
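Outside the workflow tool, the same idea can be sketched in plain JavaScript. The sample pdf_content value and the sheetRows array, which stands in for the Excel sheet, are assumptions made up for illustration; in the real workflow the text comes from the PDF-reading activity and the append step is an Excel activity.

```javascript
// Hypothetical sample of the text read from the PDF (rows separated by "\r\n").
var pdf_content = "Hayyan Khan\r\nAli Khan\r\nHamza Anwar";

// Split the text into rows at each line break, then split each row into cells at spaces.
var array = pdf_content.split("\r\n").map(function (row) {
  return row.split(" ");
});

// Stand-in for the Excel sheet: every appended row ends up here.
var sheetRows = [];

// Counter-controlled loop: one pass per row in array.
var i = 0;
while (i < array.length) {
  sheetRows.push(array[i]); // in the workflow, this is the "append row" step
  i = i + 1;
}

console.log(JSON.stringify(sheetRows));
```

Note that splitting on a single space also splits any cell whose value itself contains a space, so this only suits tables where each cell is a single word.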
