Monday, April 8, 2019

Can't rename PDF with title copied from the PDF itself


The problem is simple and can be reproduced at any time. I copy the title from inside the PDF and paste it as the new file name, but every time I do this Windows rejects it with the error "A file name can't contain any of the following characters: \ / : * ? " < > |". I am fairly sure the PDF title contains none of these characters. Also, if I first paste the text into another program (Notepad, MS Word, the Google search bar), copy it again from there, and then rename the PDF, it works.


Why does this happen?


Operating system: Windows 10
Application: Adobe PDF


Answer



If you paste the copied title into a hex editor or another program that does not filter its input, you will likely find characters that are either non-printable or otherwise violate the Win32 file name rules. Those rules are somewhat more extensive than the error message suggests: for example, file names can contain spaces, but horizontal tabs are not permitted, and yet a tab can easily come along in a copy-paste. I haven't used Adobe PDF in particular for years, but text copied from a PDF is very often slightly "corrupted" (i.e. not what you expect) in some way.
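If no hex editor is at hand, a few lines of Python can do the same inspection. This is only a minimal sketch: the function name suspicious_chars and the sample string are made up for illustration, and the check covers the characters listed in the error message plus control and invisible format characters, not the complete Win32 naming rules.

```python
import unicodedata

# Characters listed in the Explorer error message.
RESERVED = set('\\/:*?"<>|')

def suspicious_chars(text):
    """Return (index, code point, Unicode name) for each character that is
    reserved, a control character, or an invisible format character."""
    found = []
    for i, ch in enumerate(text):
        category = unicodedata.category(ch)
        if ch in RESERVED or ord(ch) < 32 or category in ("Cc", "Cf"):
            # Control characters have no Unicode name, so supply a default.
            name = unicodedata.name(ch, "<unnamed control character>")
            found.append((i, f"U+{ord(ch):04X}", name))
    return found

# Hypothetical clipboard content: a title with an embedded tab and a
# trailing carriage return (an assumption, not taken from a real PDF).
sample = "Annual\tReport 2019\r"
for entry in suspicious_chars(sample):
    print(entry)
```

An empty result means the text contains none of the characters this sketch looks for; anything it does report is a likely cause of the rename error.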


Running copied text through a program that only accepts text is actually a great way to detect and/or filter out such unexpected characters. It also enables you to do things like drop unexpected whitespace.
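The same filtering can be done explicitly instead of relying on what a text editor happens to strip. The sketch below is an assumption about what a reasonable cleanup looks like, not what Notepad or Word actually do: the name clean_title is hypothetical, and it does not handle reserved device names such as CON or NUL, or path length limits.

```python
import re
import unicodedata

# Characters listed in the Explorer error message.
RESERVED = '\\/:*?"<>|'

def clean_title(text, replacement=" "):
    """Turn copied text into something usable as a Windows file name."""
    # Fold compatibility characters, e.g. ligatures and non-breaking spaces.
    text = unicodedata.normalize("NFKC", text)
    chars = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # drop invisible format characters (zero-width space etc.)
        if category == "Cc" or ch in RESERVED:
            chars.append(replacement)  # tabs, line breaks, reserved characters
        else:
            chars.append(ch)
    # Collapse runs of whitespace into single spaces.
    collapsed = re.sub(r"\s+", " ", "".join(chars))
    # Drop surrounding spaces and any trailing dots.
    return collapsed.strip().rstrip(" .")

# Hypothetical copied title containing a tab, a colon and a line break.
print(clean_title("Annual\tReport:\r\n 2019 "))
```

Replacing rather than deleting control and reserved characters keeps words from running together when a tab or line break was the only separator between them.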

