Summary

After evaluating Obsidian and considering a move from OneNote, I had to figure out how to migrate my content. That is the topic of this post: migrating my notes.

As of 2024-03-18, the auto-conversion process has been a major pain and largely unsuccessful. I will try to highlight the key points as I understand them, in case you are in the same boat and trying to migrate. The export options have degraded over time, which has made things difficult. The disappearance of .docx as an export format in particular really hurt the process, I think, because a lot of the existing options seem to depend on it.

This whole experience confirms that my choice to get away from proprietary formats is right for me. But it's a personal decision. While I absolutely love the tool, what is their next change going to be, and will I be able to move at that point if that change doesn't work for me? Over the years there have already been a few WTF moments where I thought I had lost everything. At least with open formats I don't risk losing access to my own data or having to pay some fee.

Backup and Fail-safe

Using the OneNote app downloaded and installed locally (not 365, not the web version) allowed me to export notebooks to PDFs. They look fairly intact to me. I feel I can always go back to these if I need to. In fact, this may be enough for you: just start Obsidian fresh. Out with the old and in with the new. This removes all technical cruft and ensures the smoothest, most compatible path forward. That is, if you can live without those notes in Obsidian or are willing to rewrite them.

The Easiest - Technically Speaking

If you just want something easy and don't mind it being time-consuming, good old copy & paste does a pretty good job compared to most. OneNote Ctrl-A (select all), Ctrl-C (copy) > Obsidian Ctrl-V (paste). Just be sure to double-check that everything comes over. Images were often borked: often MOST of the images were replaced with garbled text. It seemed like this was the result of some formatting issue in the .md file, because once I fixed the first broken image, all the ones below it were fixed too. Sometimes I was able to simply backspace at the right spot to fix the issue, but I could not find a reliable spot to backspace. So I suspect that if we could fix the paste, or find a reliable fix for the image formatting issues, we would have a fairly simple path. So this is easy but time-consuming. Then again, I don't trust ANY of the methods fully, so if the data is important you had better review it in detail anyway.
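One way to make that double-check less manual is to scan the pasted notes for image references that point at nothing. The following is a minimal sketch, not a tested tool: the names `image_targets` and `missing_images` are made up for illustration, and it assumes images appear either as Obsidian embeds (`![[file.png]]`) or as standard Markdown links (`![alt](path)`). It will not catch the garbled-text case where the image reference itself was mangled.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Assumption: these two syntaxes cover the image references in the pasted notes.
IMAGE_REF = re.compile(
    r"!\[\[([^\]|#]+)[^\]]*\]\]"        # Obsidian embed: ![[file.png]] or ![[file.png|300]]
    r"|!\[[^\]]*\]\(([^)\s]+)[^)]*\)"   # Markdown image: ![alt](path "title")
)


def image_targets(markdown_text):
    """Return every image target referenced in the text, in order of appearance."""
    targets = []
    for match in IMAGE_REF.finditer(markdown_text):
        target = match.group(1) or match.group(2)
        targets.append(unquote(target.strip()))  # turn %20 back into spaces
    return targets


def missing_images(note_path, vault_root):
    """List local image targets in one note that cannot be found on disk."""
    note_path, vault_root = Path(note_path), Path(vault_root)
    text = note_path.read_text(encoding="utf-8", errors="replace")
    # Every file name in the vault, since embeds may refer to a file by name only.
    names_in_vault = {p.name for p in vault_root.rglob("*") if p.is_file()}
    missing = []
    for target in image_targets(text):
        if target.startswith(("http://", "https://")):
            continue  # remote images are not checked
        found_relative = (note_path.parent / target).exists()
        found_by_name = Path(target).name in names_in_vault
        if not (found_relative or found_by_name):
            missing.append(target)
    return missing
```

Calling `missing_images("MyVault/SomeNote.md", "MyVault")` (both paths hypothetical) returns the targets worth a manual look. An empty list only means every reference resolves, not that every image made it over.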

The Most Complete - Not all Text

pdf2docx: Of all the methods, this one maintained the tables and images the best, and the text is arguably the easiest to copy reliably. So this might be a winner. But there were obvious blocks of missing text. YMMV.
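For reference, pdf2docx is a Python library, and a batch run over a folder of exported PDFs could look roughly like the sketch below. The `Converter` class with its `convert` and `close` methods comes from pdf2docx; the helper names `docx_path_for`, `convert_pdf` and `convert_folder` and the folder layout are assumptions made for illustration only.

```python
from pathlib import Path


def docx_path_for(pdf_path, out_dir):
    """Map exports/Notebook.pdf to <out_dir>/Notebook.docx."""
    return Path(out_dir) / (Path(pdf_path).stem + ".docx")


def convert_pdf(pdf_path, out_dir):
    """Convert one PDF to DOCX with pdf2docx and return the output path."""
    # Imported here so the path helper above works without pdf2docx installed.
    from pdf2docx import Converter  # third-party: pip install pdf2docx

    target = docx_path_for(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    converter = Converter(str(pdf_path))
    try:
        converter.convert(str(target))  # converts all pages by default
    finally:
        converter.close()
    return target


def convert_folder(pdf_dir, out_dir):
    """Convert every *.pdf in pdf_dir, one DOCX per notebook export."""
    return [convert_pdf(p, out_dir) for p in sorted(Path(pdf_dir).glob("*.pdf"))]
```

Given the missing blocks of text, the output still needs to be compared against the original PDF.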

The Most Complete - No Tables

There is probably a better way to do this, but this is what has yielded the best results so far in terms of having the most complete and accurate copy of the original data. It still lost my tables, though, so if the tables are more important to you, you might favor other methods. Hold on to your hats, folks. The ride is about to get bumpy. See the various sections below for more details. What I did:

  1. Export PDFs: On Windows, via the OneNote app (not the online Office version), I exported each notebook as a single PDF. Highlight Notebook to Export > File > Export > Notebook > PDF
  2. Add a Cover Page to PDF: Open the PDF in LibreOffice Draw. Add a new Page 1. Grab a screenshot and paste it into Page 1. Export as PDF with (1) Lossless Compression selected and (2) Reduce image resolution unchecked (unless you want to make the images smaller). I added "Draw" to the filename (or used a new folder) and removed spaces, e.g. OrigFilenameDraw.pdf. Technically you can skip this step. For mine, I lost the first page because the conversion turned it into the cover page of the "book". YMMV. But when I added the "cover" page, it showed up twice, so I may be misunderstanding what is going on. Better twice than to lose things.
  3. Import PDF into Calibre: For some reason this failed until I used the filesystem right-click: Open In > Other Application > E-book Viewer.
  4. Calibre Convert Book: Select Title > Convert > Individually > Output Format: DOCX (upper right) > OK. I don't see any setting for image quality.
  5. Save DOCX to Disk: Select Title > Format "DOCX" (right-click) > Save DOCX to Disk.
  6. Open .docx, save as .odt: Open the .docx file in LibreOffice Writer. Save As, and change the format to .odt.
  7. Convert ODT to MD using Pandoc: pandoc --extract-media ./<foldername_for_images> filename.odt -o filename.md, e.g. pandoc --extract-media ./attachments AWSServerlessDraw.odt -o AWSServerlessDrawOdt.md
  8. Copy to Vault: Via the filesystem, I copied the attachments directory and the .md file into my vault, inside a folder called OneNoteConverted. They automatically loaded into Obsidian for review.
  9. Deal with Tables: As I said in the opening paragraph, this method still doesn't handle tables well. It might make sense to also use one of the other methods that handles tables well, perhaps pdf2docx, and copy those over. Use the original PDFs as your reference for both.
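Steps 4 through 7 above were done through the GUIs, but each tool also ships a command-line entry point, so the chain could in principle be scripted. The sketch below is untested against real notebooks. It assumes `ebook-convert` (Calibre), `soffice` (LibreOffice) and `pandoc` are all on the PATH, and that the command-line conversions behave like their GUI counterparts. The function names `build_commands` and `run_pipeline` are hypothetical. It skips the cover-page step and does nothing extra for tables.

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path, media_dir="./attachments"):
    """Return the three commands, meant to be run from inside the working folder."""
    pdf = Path(pdf_path).resolve()  # absolute, because the commands run elsewhere
    stem = pdf.stem
    return [
        # Step 4/5: Calibre PDF -> DOCX
        ["ebook-convert", str(pdf), stem + ".docx"],
        # Step 6: LibreOffice DOCX -> ODT, without opening the GUI
        ["soffice", "--headless", "--convert-to", "odt", "--outdir", ".", stem + ".docx"],
        # Step 7: Pandoc ODT -> Markdown, images extracted into media_dir
        ["pandoc", "--extract-media", media_dir, stem + ".odt", "-o", stem + ".md"],
    ]


def run_pipeline(pdf_path, work_dir):
    """Run the conversions in order and return the path of the resulting .md file."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    for command in build_commands(pdf_path):
        # check=True stops the chain at the first tool that fails
        subprocess.run(command, cwd=work, check=True)
    return work / (Path(pdf_path).stem + ".md")
```

Running the commands from inside the working folder keeps the image links in the generated .md relative, so the folder can be copied into the vault as a unit.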

More Details and Conversion Methods

Loosely in order from best (top) to worst:

Programs

File Formats

Other Ideas

- Use JavaScript to isolate and extract the live HTML web page generated by the OneNote.com online views, then convert the HTML to MD. Not sure exactly where to begin here, but this seems like a highly interesting idea. Run with it, please.
- Power Automate on Windows.
- AutoIT: GUI-based UI automation. Perhaps something that can automate the UI might work around the embedded images issues.
- Digging into the object model of the OneNote formats, using a Windows-based language to include the libraries in a binary. That is, if you can get those without a paid developer subscription.
- Continue down the GitHub rabbit hole.
- Try GitHub Copilot to see if AI could help generate a viable solution.
- Try other programming libraries to extract PDFs to MD.
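As a possible starting point for the HTML-to-MD idea, here is a minimal sketch using only Python's standard library. It assumes the page content has already been saved as an HTML fragment, and it knows nothing about the markup OneNote.com actually produces. The class name `MarkdownSketch` and the function `to_markdown` are made up for illustration, and only headings, paragraphs, list items, bold, italic, line breaks and images are handled. Tables are ignored entirely.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Translate a handful of HTML tags into Markdown; everything else passes through as text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag in ("ul", "ol"):
            self.parts.append("\n")
        elif tag == "li":
            self.parts.append("\n- ")  # ordered lists are flattened to bullets
        elif tag in ("b", "strong"):
            self.parts.append("**")
        elif tag in ("i", "em"):
            self.parts.append("*")
        elif tag == "br":
            self.parts.append("\n")
        elif tag == "img":
            self.parts.append("![%s](%s)" % (attrs.get("alt") or "", attrs.get("src") or ""))

    def handle_endtag(self, tag):
        if tag in ("b", "strong"):
            self.parts.append("**")
        elif tag in ("i", "em"):
            self.parts.append("*")
        elif tag in ("ul", "ol"):
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Collapse whitespace runs but keep single spaces around inline tags.
        self.parts.append(re.sub(r"\s+", " ", data))


def to_markdown(html_text):
    parser = MarkdownSketch()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    lines = [line.rstrip() for line in "".join(parser.parts).splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```

A real attempt would probably lean on an existing HTML-to-Markdown library rather than a hand-rolled parser. The point here is only that the conversion step itself is small once the HTML has been isolated.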