Monday, August 05, 2013

UNO IllegalArgument during import phase: Source file cannot be read. URL seems to be an unsupported one.

Creating Word documents in a fully Linux-based environment can be tricky. One common trick is to save HTML with a .doc extension, which Word will happily open, but it has drawbacks: for example, if you make changes and try to save the document again, you can't save it as a .doc.

So we have a multi-step chain of services set up on our Quill Platform to allow us to render our articles as genuine Word documents.

In the Rails app:

  1. User clicks 'download as word doc' on an article, which results in a request like this: GET /articles/1234.doc
  2. Check for an existing Word doc representing the correct version of the requested article. If it's there, serve it back as a document. If not:
    1. Render the article as an HTML string, and save a WordDocumentConversion object which encapsulates the resulting HTML.
    2. Submit a job to Resque to perform the actual conversion.
    3. Redirect the user, and flash a message saying "That document might take a minute or two to generate - we'll email it to you when it's ready"

In the Resque job:

  1. Load the WordDocumentConversion.
  2. POST the saved HTML as a file input to our DocumentConverter web service, a little Sinatra app which provides a RESTful endpoint around LibreOffice.
  3. Save the response as a .doc file, in binary mode, and email it to the user.

In the DocumentConverter web service:

  1. Accept the POST-ed HTML content.
  2. Invoke unoconv, a command-line Python script that wraps the LibreOffice / OpenOffice headless document conversion service.
  3. Respond with the binary content of the returned Word doc.
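
As a rough illustration of the last step in that chain, here is a minimal Python sketch of shelling out to unoconv. Our real service is a Sinatra app, so this is not our actual code; the function names, file paths and timeout are hypothetical, and it assumes unoconv is on the PATH and exits with a non-zero status when the conversion fails.

```python
import subprocess
from pathlib import Path


def build_unoconv_command(html_path, doc_path):
    """Build the unoconv command line that converts an HTML file to .doc."""
    # -f selects the output format, -o the output file.
    return ["unoconv", "-f", "doc", "-o", str(doc_path), str(html_path)]


def convert_html_to_doc(html_path, doc_path, timeout=120):
    """Run unoconv and return the binary content of the resulting Word doc."""
    result = subprocess.run(
        build_unoconv_command(html_path, doc_path),
        capture_output=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        # Assumption: a failed conversion is signalled by a non-zero exit code.
        raise RuntimeError(result.stderr.decode("utf-8", errors="replace"))
    # Read in binary mode: a .doc file is not text.
    return Path(doc_path).read_bytes()
```

Keeping the command construction in its own function makes it easy to check what will be run without needing LibreOffice installed.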

It all works pretty well, most of the time, and was a good exercise in building complexity by keeping each individual part very simple. However, we recently moved our live platform servers from the US-EAST EC2 region over to the EU-WEST region, and took the opportunity to rebuild them from scratch on an updated Ubuntu. While setting up our staging server we got the above error ('UNO IllegalArgument during import phase: Source file cannot be read. URL seems to be an unsupported one.'), which had us scratching our heads for most of Friday.

To cut a long story short, this is another instance of what we refer to as "Tao errors": the error which can be seen is not the true error. When it says that the URL is unsupported, what it actually means is "I can't handle the file format you've requested", usually because some LibreOffice / OpenOffice packages are missing.

If you've only installed the base & core packages, that's not enough - to be able to render Word documents, you need to actually install the "writer" package as well. A quick scan of the unoconv documentation does give you this little tidbit -

Various sub-packages are needed for specific import or export filters, e.g. XML-based filters require the xsltfilter subpackage, e.g. libobasis3.5-xsltfilter.
Important: Neglecting these requirements will cause unoconv to fail with unhelpful and confusing error messages.
- so I guess we were warned... but still, problem solved at last.
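
If you want to save the next person a Friday, one option is to translate the misleading message into a hint about the likely cause. The sketch below is only an illustration: the helper name and the wording of the hint are made up, and the mapping reflects what we found on our servers rather than anything documented by unoconv.

```python
MISLEADING_FRAGMENT = "URL seems to be an unsupported one"


def explain_unoconv_error(stderr_text):
    """Return a more useful hint for the misleading unoconv error, if present."""
    if MISLEADING_FRAGMENT in stderr_text:
        # Assumption based on our experience: the real cause is usually a
        # missing import/export filter package, not a bad URL.
        return (
            "unoconv reported an unsupported URL; this often means the "
            "filter for the requested format is missing. Check that the "
            "LibreOffice 'writer' package is installed."
        )
    # Anything else is passed through unchanged, minus surrounding whitespace.
    return stderr_text.strip()
```

Calling it with the stderr of a failed conversion gives either the hint or the original message.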

