Feature proposal: use date+time of imported PDF files instead of date of import

I have been using Paperwork 2.0.1 (Flatpak, Linux Mint) for a few
weeks and I’m very excited: I had been looking for something like
this for a long time.

I started scanning some of my paper documents 4 years ago, using a
network scanner that writes to my own local SMB share. In total
that is approx. 300 files and 500 MB.

The “Many PDFs in one shot” import feature is really great. Also,
the “ignore everything already in the database” behavior is amazing.
The import with Paperwork 2.0.1 took about 3 hours, but all documents
got the import date. Because Paperwork is so fast, there are a lot of
directories with an identical date/time that differ only in a suffix, like:
20201226_1901_39_27
20201226_1901_39_53
20201226_1901_39_8

I think it’s better to use the date+time of the imported PDF files because:

  • it helps when importing old, unsorted collections of PDF files
  • a flawless import may convince new Paperwork users
    (see the recent post by user gege)
  • a lot of mid-price and high-price scanners can scan to an
    SMB/FTP share or even to email, and work perfectly with
    ADFs, paper formats and contrast settings
  • unfortunately, most scanners just count up numbers in the
    generated file names
  • for me, the perfect workflow would be to first scan paper to the
    SMB share for some weeks, then import everything into Paperwork
    and possibly assign labels from time to time
  • furthermore, I copy some of the scanned documents to project
    folders (e.g. “tax declaration 2019”) but leave them all in the
    SMB share folder
  • changing the date manually is not a good option for many files
  • it is easier to implement than guessing the date from the scanned
    document
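
To illustrate the idea, here is a minimal Python sketch, not Paperwork's actual import code. It assumes the directory name format is YYYYMMDD_HHMM_SS, as inferred from the example names above, and that the file's modification time is the date to use. The function name doc_dir_name_from_mtime is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def doc_dir_name_from_mtime(pdf_path: Path) -> str:
    """Build a YYYYMMDD_HHMM_SS name from the file's modification time.

    Assumption: the format is inferred from directory names such as
    20201226_1901_39; it is not taken from Paperwork's source code.
    """
    mtime = pdf_path.stat().st_mtime  # seconds since the epoch
    # fromtimestamp() converts to local time, like a file manager would show
    return datetime.fromtimestamp(mtime).strftime("%Y%m%d_%H%M_%S")
```

Note that copying files between shares can reset the modification time unless the copy preserves timestamps, so this only works if the scanner's original dates are still intact.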

BR
Tom

:smiley:

I agree. I am looking for a way to migrate around 300 documents from the last 5 years to Paperwork. Having all of them imported with today’s date is not an option, and changing the date on all 300 of them is something I would like to avoid.
I would highly appreciate a way to tell the import to use the file date as the document date.
BR
Dieter

I finally found a way to migrate my existing documents. I was able to export a table of filenames and document dates from my existing system. Then I used a shell script to generate the folders, put the documents in, and rename them to doc.pdf.
Paperwork now shows them as if they had been imported through the user interface. Even the automatic label assignment worked.
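
In case it helps someone, the steps can be sketched roughly as follows. This is a Python approximation, not the original shell script. It assumes the table is a CSV file without a header row, with one "filename,YYYY-MM-DD HH:MM:SS" pair per line, and that folders are named YYYYMMDD_HHMM_SS as in the examples earlier in this thread. The function name migrate, the CSV layout, and the simple counter suffix for duplicate timestamps are all assumptions; the work directory path has to be supplied by the caller.

```python
import csv
import shutil
from datetime import datetime
from pathlib import Path


def migrate(table_csv: Path, source_dir: Path, work_dir: Path) -> list:
    """Copy each PDF listed in the table into its own dated folder as doc.pdf."""
    created = []
    with table_csv.open(newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines in the table
            filename, date_text = row
            stamp = datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")
            base = stamp.strftime("%Y%m%d_%H%M_%S")
            target = work_dir / base
            suffix = 0
            # Assumption: duplicates get a numeric suffix (_1, _2, ...)
            while target.exists():
                suffix += 1
                target = work_dir / f"{base}_{suffix}"
            target.mkdir(parents=True)
            # copy2 keeps the original file's timestamps on the copy
            shutil.copy2(source_dir / filename, target / "doc.pdf")
            created.append(target)
    return created
```

It is safer to run this against a copy of the work directory first, and to keep Paperwork closed while the folders are being created.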
BR
Dieter

Hello,

I forgot to mention here that this is actually a long-standing issue: “PDF Import: automatically figure out the date of the document” (#338) in the World / OpenPaperwork / paperwork issue tracker on GitLab.
I still need to figure out a way to handle this problem gracefully.

Best regards,