Difference between revisions of "New PND format"

From Pandora Wiki

Revision as of 20:37, 9 February 2014

The current PND format has some shortcomings as listed below. This page should serve as a discussion page/white board for how the format could be improved.

This is a proposal for discussion; these are just some opening shots.

Current situation

The current (ISO-based) PND format has the following shortcomings:

  • It currently uses ISO, SquashFS, and other filesystems... This means that there's no standard to follow, and libpnd needs to carry around support for a multitude of file systems.
  • It (often, but not always) uses the ISO file system which is inefficient at storing data, because:
    • It contains duplicate headers for each file entry[1] (one version with big-endian integers, one version with little-endian integers[2]) which leads to many kilobytes of wasted space.
    • Its file tables are fixed-size, and there are therefore limitations on how many folders you can have, what names you can give your files, how big your files can be etc. For instance, file names can be at most 31 characters long, all upper-case, and limited to the ASCII character encoding, and an ISO only supports a folder depth of 8[3]. To support more, the Joliet file system extension is needed (or isofs won't recognize all file paths), and the Joliet header can be placed practically anywhere in the file, which means additional seek times.
    • It needs various file system extensions to behave correctly, like for example Rock Ridge Interchange Protocol and Joliet. Without them, it becomes pretty useless as demonstrated above, and with them, it becomes difficult to read by a tool (You can only use isofs! There are no other tools out there for programmers to use!).
    • It's very difficult to use since you need special ISO making tools to create the image. ISO is a relatively rare format if you're not technically inclined.
    • If you want to add or remove files from it, it's near impossible if you use a compact ISO file, since there's no way that you can "expand" an existing ISO file.
  • The PND header data is at the end (or in the middle of the file if a screenshot is included) which makes it impossible for e.g. libmagic to recognize the PND file. It will instead recognize it as an ISO file. Having the header data at the end also makes it take a very long time to find the data, making tools and the libpnd library very inefficient.
  • The PND file uses a custom XML format for its metadata. There's no reason to do this, especially since the established ".desktop" file format fills exactly the same function.

(note: ".desktop" does not contain all metadata we have wanted over time, but it's of course possible to add extensions à la "X-Pandora-Whatever=xyz")

  • There's no "index" for the PND file. The whole file has to be scanned (albeit backwards) to find a PXML file, and there's a big chance for false positives etc.
  • Data is just appended linearly to the file so there's no order. If the format is to be extended (to e.g. include an icon file after the screenshot file), should the data just be appended as well?
  • UTF-8 is strictly the only encoding that is supported. If you make your PXML on a Windows machine, it won't work.

Proposed revisions

Step 1: File system

The file system for PNDs should be replaced by the uncompressed ZIP archive format. ZIP adds little overhead per file (a 30-byte local header and a 46-byte central directory entry, plus the file name), and an uncompressed ZIP makes it possible to read data from the file without having to do any decompression.

ZIP files are mountable using various implementations of zipfs, and most of these implementations won't store files in memory when they are being read if the ZIP file is uncompressed.
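
A minimal sketch of what "uncompressed" means in practice, using Python's standard zipfile module. The member name and contents are made up for illustration. Members written with ZIP_STORED sit byte-for-byte inside the archive, so a reader can hand out the data without inflating anything.

```python
import io
import zipfile


def build_stored_zip(files):
    """Return the bytes of a ZIP archive whose members are stored, not deflated.

    `files` maps member names to their contents (bytes).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Hypothetical package contents, for illustration only.
payload = b"#!/bin/sh\necho hello from the package\n"
archive = build_stored_zip({"run.sh": payload})

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    info = zf.getinfo("run.sh")
    # Stored members have identical compressed and uncompressed sizes.
    print(info.compress_type == zipfile.ZIP_STORED,
          info.compress_size == info.file_size)

# The member's bytes appear verbatim inside the archive.
print(payload in archive)
```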

Step 2: Metadata

PND files should no longer require special tools for them to be created. Therefore, since ZIP files support random access on files, it shouldn't be necessary to append/prepend metadata to the file. The user simply includes the "PXML.xml" and "preview.png" files inside of the ZIP, and any tools that need information about the PND can simply go to the central directory of the ZIP file (or use a simple ZIP library to do it for them) and get the location of the file inside of the ZIP. This should also dramatically decrease loading times for PNDs.
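
As a rough illustration of the lookup described above, the following sketch fetches the metadata files through the central directory with Python's zipfile module. The function name `read_metadata` is hypothetical; the file names are the ones mentioned in the text.

```python
import io
import zipfile

METADATA_NAMES = ("PXML.xml", "preview.png")  # names taken from the proposal text


def read_metadata(pnd):
    """Return {name: bytes} for the metadata files present in a ZIP-based PND.

    `pnd` is a path or a file-like object. zipfile parses the central
    directory when the archive is opened, so each lookup below is a
    dictionary access followed by a seek to the member's data.
    """
    found = {}
    with zipfile.ZipFile(pnd) as zf:
        for name in METADATA_NAMES:
            try:
                info = zf.getinfo(name)
            except KeyError:
                continue  # this metadata file is not in the package
            found[name] = zf.read(info)
    return found


# Build a small in-memory package to exercise the function.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("PXML.xml", b"<PXML/>")
    zf.writestr("bin/game", b"\x7fELF")

metadata = read_metadata(demo)
print(sorted(metadata))
```

Only the files that exist are returned, so a package without a preview image is not an error in this sketch.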

Note: performance testing is needed. At the time we started, ZIP was enormously slower than plain ISO; with driver changes that may or may not still be so, which is why we adopted the multiple-filesystem-type system. zipfs is one possible option.

Step 3: Metadata format

Current PND files include so-called "PXML.xml" files. These files have a custom XML format that has a strange structure.

These files should be replaced by ".desktop" files. A PND can contain one or more ".desktop" files in its root directory that specify how an application should be launched. PND tools simply use all ".desktop" files they can find in the PND when creating launchers for the contents of the PND.
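
A sketch of how a PND tool might collect the launchers, assuming a ZIP-based package and using only Python's standard library. `load_launchers` is a hypothetical name, and configparser is used here as a stand-in for a real ".desktop" parser; it does not implement the whole Desktop Entry format.

```python
import configparser
import io
import zipfile


def load_launchers(pnd):
    """Parse every ".desktop" file in the root of a ZIP-based PND.

    Returns {file name: {key: value}} taken from each [Desktop Entry] group.
    """
    launchers = {}
    with zipfile.ZipFile(pnd) as zf:
        for name in zf.namelist():
            # Root-level entries have no "/" in their archive name.
            if "/" in name or not name.endswith(".desktop"):
                continue
            # interpolation=None keeps field codes such as %f in Exec= intact.
            parser = configparser.ConfigParser(interpolation=None)
            parser.optionxform = str  # .desktop keys are case-sensitive
            parser.read_string(zf.read(name).decode("utf-8"))
            if parser.has_section("Desktop Entry"):
                launchers[name] = dict(parser["Desktop Entry"])
    return launchers


# In-memory demo package with one root-level launcher and one nested file.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("bla.desktop",
                "[Desktop Entry]\nType=Application\nName=Bla\nExec=./bla %f\n")
    zf.writestr("share/other.desktop", "[Desktop Entry]\nName=Ignored\n")

launchers = load_launchers(demo)
print(launchers["bla.desktop"]["Exec"])
```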

Advantages

  • There don't have to be any special tools for reading PND files. The package can be run on any platform using any programming language that can read ZIP files.
  • We can use existing facilities to manage the launching of applications. The ".desktop" files can basically be copied without modification into standard locations of the system, and all launchers will become aware of them.
  • Reading PND files becomes easier (because of better tool support... sorry, but you can't link to libpnd in all programming languages) and quicker (because of the central directory for random file access).

Benchmarks

A benchmark was carried through to measure the performance of various package implementations. The compared systems were:

  • ZIP-based uncompressed packages using fuse-zip (a zipfs implementation) to read the files
  • ZIP-based compressed packages also using fuse-zip.
  • ISO 9660-based packages using isofs
  • CramFS-based packages.

A total of 9 files were generated for the test. They have sizes 1, 2, 4, 8, 16, 32, 64, 128 and 256 MB (511 MB in total, which matches the archive sizes below), and were generated via "dd if=/dev/urandom of=$file bs=1M count=$size".

Because the files contain random data, they did not compress very well. The resulting files were:

  • "zipcompressed.zip" 511.1 MB
  • "zipuncompressed.zip" 511.0 MB
  • "iso.iso" 511.3 MB
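  • "cramfs.image" 95.4 MB (WOW! But random data cannot compress this well. cramfs limits each file to just under 16 MB, so the five larger files were most likely truncated: 1 + 2 + 4 + 8 + 5 × 16 = 95 MB.)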

Some notes:

  • No write tests were done for obvious reasons
  • No random access tests were done: the Pandora uses SD cards, which are not penalized for random access, and that's not what we're interested in anyway.
  • Before each test, the commands "sync; echo 3 > /proc/sys/vm/drop_caches" were run.

First test: linear read time

Command: "time cat mountpoint/* > /dev/null"

  • ISO: 8.119 seconds
  • CramFS: 1.488 seconds (WOW! But the image holds only 95.4 MB, so far less data was read than in the other tests.)
  • Zip (compressed): 8.535 seconds
  • Zip (uncompressed): 8.290 seconds
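
For repeat runs, the linear read could also be timed from a script. The sketch below is an assumption about how one might do that in Python, not the tool used for the numbers above; `time_linear_read` is a hypothetical name, and the caches still have to be dropped separately (as root) before each real measurement.

```python
import os
import tempfile
import time


def time_linear_read(mountpoint, chunk_size=1 << 20):
    """Read every regular file directly under `mountpoint` front to back.

    Returns (bytes_read, seconds). Like the "cat" test, this does not
    descend into subdirectories and discards the data it reads.
    """
    total = 0
    start = time.perf_counter()
    for entry in sorted(os.scandir(mountpoint), key=lambda e: e.name):
        if not entry.is_file():
            continue
        with open(entry.path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total, time.perf_counter() - start


# Self-check on a throwaway directory (not a real benchmark).
with tempfile.TemporaryDirectory() as tmp:
    for size_kib in (1, 2, 4):
        with open(os.path.join(tmp, f"{size_kib}.bin"), "wb") as fh:
            fh.write(os.urandom(size_kib * 1024))
    nbytes, seconds = time_linear_read(tmp)
    print(nbytes, "bytes read")
```

Returning the byte count alongside the time makes it easy to spot a filesystem that silently delivers less data than the others.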

Second test: RAM usage

-- TODO, have to find a good method of measuring this --

Remaining issues

  • How should libpnd tell the difference between old and new PND files? Should it depend on the "file" tool, should it run its own recognition algorithm, or what? Should the extension be changed to e.g. ".box" (Which was proposed in a thread)?
  • How to make sure that the ZIP files are uncompressed? Should we provide a script that "uncompresses" an ordinary ZIP file?
  • Is cramfs better than ZIP?
    • cramfs cannot store files uncompressed the way ZIP can
    • If you compare compressed ZIP and cramfs, cramfs is more efficient.
    • cramfs is harder to use than ZIP (for the developer).
    • cramfs needs to have all of its opened files in memory.
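
On the question of making sure a ZIP is uncompressed, one possible shape for such a script is sketched below with Python's zipfile module. Both function names are hypothetical. The check looks at the compression method of every member, and the rewrite copies each member into a new archive with ZIP_STORED.

```python
import io
import zipfile


def is_fully_stored(zip_source):
    """Return True if no member of the archive is compressed."""
    with zipfile.ZipFile(zip_source) as zf:
        return all(i.compress_type == zipfile.ZIP_STORED for i in zf.infolist())


def rewrite_stored(zip_source, zip_target):
    """Copy every member of `zip_source` into `zip_target` without compression."""
    with zipfile.ZipFile(zip_source) as src, \
            zipfile.ZipFile(zip_target, "w", zipfile.ZIP_STORED) as dst:
        for info in src.infolist():
            data = src.read(info)  # inflates the member if it was deflated
            fresh = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            fresh.external_attr = info.external_attr  # keeps Unix permission bits
            fresh.compress_type = zipfile.ZIP_STORED
            dst.writestr(fresh, data)


# Demo: a deflated archive is converted to a stored one.
compressed = io.BytesIO()
with zipfile.ZipFile(compressed, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.txt", b"a" * 4096)

stored = io.BytesIO()
rewrite_stored(compressed, stored)
print(is_fully_stored(compressed), is_fully_stored(stored))
```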

Upgrade path from the old PND format

A simple script can be written that extracts the old PND file, including the screenshot and the PXML.xml file. The PXML.xml file is then converted to one or many .desktop files, and the desktop files, the preview.png file, and the package contents are packed into an uncompressed ZIP that is then renamed to "*.pnd".

Usage scenario

Repackaging of an application from another package format

  • The user grabs the package for the application
  • He dumps the executable and all required libraries into a folder
  • He dumps a "preview.png" file into the folder that he's made
  • He copies the ".desktop" file(s) for the application from the old package, opens them, replaces "Exec=/usr/bin/bla" with "Exec=./bla" and saves them in the directory
  • He uses a ZIP archiver to make a ZIP out of the folder, making sure that he sets the "uncompressed" option.
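
The repackaging steps above could be scripted roughly as follows. This is a sketch with hypothetical helper names (`relocate_exec`, `pack_pnd`); it handles only the simple "Exec=/path/to/program args" form and does not deal with quoted program paths.

```python
import io
import re
import zipfile


def relocate_exec(desktop_text):
    """Rewrite "Exec=/some/dir/prog args" to "Exec=./prog args"."""
    def fix(match):
        command, _, args = match.group(1).partition(" ")
        program = command.rsplit("/", 1)[-1]
        return "Exec=./" + program + (" " + args if args else "")
    return re.sub(r"^Exec=(.*)$", fix, desktop_text, flags=re.MULTILINE)


def pack_pnd(files, target):
    """Write `files` ({archive name: bytes}) into an uncompressed ZIP."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)


# Demo with made-up package contents.
original = "[Desktop Entry]\nName=Bla\nExec=/usr/bin/bla %f\n"
patched = relocate_exec(original)
pnd = io.BytesIO()
pack_pnd({"bla.desktop": patched.encode("utf-8"), "bla": b"\x7fELF"}, pnd)
print(patched.splitlines()[-1])
```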

Creating PNDs as part of a build process

  • The build tool creates a directory with all of the necessary information like above and invokes the "zip" utility to archive the folder without compression (e.g. "zip -0 -r").

Accessing a PND's contents for the...

...lazy programmer

  • The programmer extracts the PND into a folder via libzip and accesses its contents.

...pragmatic programmer

  • The programmer uses zipfs to mount the ZIP and accesses the file's contents.

...smart performance-aware programmer

  • The programmer scans the ZIP file from the start until he finds the local file header signature 0x04034b50 (stored little-endian, i.e. the bytes "PK\x03\x04"). The jumps below are counted from the start of the previously read field, beginning at the first byte of the signature.
  • He jumps 18 bytes forward and reads the 4-byte int at that location (the compressed size), storing it in "length".
  • He jumps 4 bytes and checks if this int (the uncompressed size) matches "length". If it doesn't, it means that the file is compressed, and an error is reported.
  • He then jumps forward 4 bytes and stores the short at that location in a variable "nameLength".
  • Another 2-byte jump gets the short "extensionsLength".
  • He then jumps 2 bytes and reads "nameLength" bytes from the file.
  • He then uses memcmp() to see if these bytes match the sought-after file name (the name is not NUL-terminated, so strcmp() is not safe here). If not, he skips this entry and continues the scan.
  • The programmer now skips "extensionsLength" bytes.
  • The programmer reads "length" bytes from the file, and uses this as the sought-after file data.
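
The scan above can be sketched in Python with the struct module. For brevity the sketch works on the archive held in memory as bytes rather than calling fseek() on a file, and `find_stored_file` is a hypothetical name. It assumes the sizes are present in the local header, which is not the case for archives written with a trailing data descriptor.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"  # 0x04034b50 as it appears on disk (little-endian)
LOCAL_HEADER = struct.Struct("<4s5H3L2H")  # fixed 30-byte part of the header


def find_stored_file(archive, wanted):
    """Scan `archive` (bytes) for a stored member named `wanted` (bytes)."""
    pos = archive.find(LOCAL_SIG)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(archive):
        (_sig, _version, _flags, _method, _time, _date, _crc,
         length, uncompressed, name_len, extra_len) = \
            LOCAL_HEADER.unpack_from(archive, pos)
        if length != uncompressed:
            raise ValueError("member is compressed")
        name_start = pos + LOCAL_HEADER.size
        name = archive[name_start:name_start + name_len]
        data_start = name_start + name_len + extra_len
        if name == wanted:
            return archive[data_start:data_start + length]
        # Not the file we want: continue scanning after this member's data.
        pos = archive.find(LOCAL_SIG, data_start + length)
    return None


# Demo archive with two stored members.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("first.txt", b"one")
    zf.writestr("PXML.xml", b"<PXML/>")

print(find_stored_file(buf.getvalue(), b"PXML.xml"))
```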

...performance fascist/programmer who wants low seek times

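  • The programmer scans the file backwards from the end until he finds 0x02014b50, the signature of a central directory file header. (Scanning backwards finds the last entry first; the End of Central Directory record, signature 0x06054b50, stores the offset of the first entry and the number of entries.)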
  • He then uses the following table to get the information he needs:
ZIP central directory file header
Offset Bytes Description
0 4 Central directory file header signature = 0x02014b50
4 2 Version made by
6 2 Version needed to extract (minimum)
8 2 General purpose bit flag
10 2 Compression method
12 2 File last modification time
14 2 File last modification date
16 4 CRC-32
20 4 Compressed size
24 4 Uncompressed size
28 2 File name length (n)
30 2 Extra field length (m)
32 2 File comment length (k)
34 2 Disk number where file starts
36 2 Internal file attributes
38 4 External file attributes
42 4 Relative offset of local file header
46 n File name
46+n m Extra field
46+n+m k File comment
  • The relative file offset can then be used to jump to the file in question. The file is then scanned as described in the previous section.
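
A sketch of the central directory lookup in Python, again on an in-memory archive. Note that it does not scan backwards for 0x02014b50 as described above: it locates the End of Central Directory record (0x06054b50) first and walks the entries from the offset stored there. `central_directory` is a hypothetical name, and ZIP64 archives and archive comments containing the signature are not handled.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"     # End of Central Directory record, 0x06054b50
CENTRAL_SIG = b"PK\x01\x02"  # central directory file header, 0x02014b50
CENTRAL_HEADER = struct.Struct("<4s6H3L5H2L")  # fixed 46-byte part, as in the table


def central_directory(archive):
    """Return {name: (local header offset, compressed size, method)}."""
    eocd = archive.rfind(EOCD_SIG)
    if eocd == -1:
        raise ValueError("no End of Central Directory record found")
    # At EOCD offset 10: total entry count, directory size, directory offset.
    count, _cd_size, offset = struct.unpack_from("<HLL", archive, eocd + 10)
    entries = {}
    for _ in range(count):
        (sig, _made, _need, _flags, method, _time, _date, _crc,
         comp_size, _uncomp_size, name_len, extra_len, comment_len,
         _disk, _iattr, _eattr, local_offset) = \
            CENTRAL_HEADER.unpack_from(archive, offset)
        if sig != CENTRAL_SIG:
            raise ValueError("corrupt central directory")
        name_start = offset + CENTRAL_HEADER.size
        name = archive[name_start:name_start + name_len]
        entries[name] = (local_offset, comp_size, method)
        offset = name_start + name_len + extra_len + comment_len
    return entries


# Demo archive with made-up contents.
archive_buf = io.BytesIO()
with zipfile.ZipFile(archive_buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("bin/game", b"\x7fELF")
    zf.writestr("PXML.xml", b"<PXML/>")
archive = archive_buf.getvalue()

directory = central_directory(archive)
offset, size, method = directory[b"PXML.xml"]
print(archive[offset:offset + 4] == b"PK\x03\x04", size, method)
```

The returned offset points at the member's local file header, so the local-header parsing described in the previous section can start right there instead of scanning.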

The 2014/Pyra effort

A new format is being developed for the DragonBox Pyra handheld. Here are the requirements/notes:

Requirement Solutions Importance Status (decided, undecided, postponed, won't fix, ...) Notes
Should the new format be supported by Pandora AND Pyra?
  • Yes
    • Makes releasing compatible applications easier.
  • No
    • Pandora is softfp and Pyra is hardfp - it's unlikely one package could work on both systems anyway.
    • For current OpenPandora, we can already run Debian on SD-card.
      • Maybe we can provide Pyra apps for OpenPandora using Debian on the SD card as opposed to Debian on internal storage
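Very high Decided: compatibility with SuperZaxxon in the future PND system is not important http://boards.openpandora.org/topic/15613-drop-support-for-superzaxxon-in-new-pnd-format/
  • http://boards.openpandora.org/topic/15530-the-software-side/page-7#entry310691
  • This decision affects several of the requirements and possible solutions on this table and should be decided sooner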
API/Daemon to switch between joystick/keyboard mode
Guideline for consistent use of gaming buttons:
  • It should reduce the need of patches on top of the ported software
  • A keyboard/joystick remapping function
  • A guideline for software developers
No single-solution, but a set of them
Easier to parse than current PNDs and less fragile
  • header + metadata (UBJSON) + icon + screenshots + squashfs
  • tar archive
    • more future proof
    • easier for shell scripts
      • with dd you can extract the tar part and so
    • filenames and file permissions are overhead
      • But the overhead is too small to be significant?
    • easier to modify after the package has been built
      • not really a problem, because the user should be contributing to the recipe anyway
      • a specialized tool can be created if another solution is chosen
  • zipfs uncompressed filesystem
    • See above sections
  • ar archive
    • Like .deb format
Should the metadata include package dependencies? Decided
  • After a long heated discussion, general consensus seems to be:
    • Tune the makepnd script to create flat PNDs (packages whose only dependencies are the minimum required set already present) and light PNDs (packages with .deb dependencies).
      • Both packages would be included in the repo and the user downloads whichever he prefers.
Minimum set of dependencies (SDL, ...) WIP Progress is happening on GitHub:
It should be click-and-run
  • binfmt
  • Shebang
  • File (extension) association
Less duplication of metadata (c.f. PXML files)
AUR-like system:
  • Building recipes is easier to replicate and contribute
  • If a maintainer is a jerk, a moderator can reassign his maintainership
Fork makepkg Cloudef is working on this task
A compatibility layer for the current *.PND system:
  • The more transparent, the better
  • Dual-boot
  • Virtual Machine
  • Tune Debian system to include all libraries used in current SuperZaxxon
    • OpenEmbedded is based on Debian, after all
    • Probably won't happen because of hard/soft floating-point library issues
  • chroot
    • chroot doesn't start services like the real system does, so the services of the chroot must be replaced/configured
High
  • Pyra will have larger internal space, so a duplicate system isn't a large problem
  • The compatibility layer should have its own table?
Cross-building? Undecided
Home-dir of applications
  • Maybe one per SD-card
  • Union-fs? Aufs?
  • overridable by env var?
Overridable scripts
  • By creating files with the right name, you can override files in the package (e.g. scripts)
  • Yes:
    • Cool for hackers
  • No:
    • Necessitates a union filesystem
    • Not absolutely necessary - hackers can just rebuild the package.
File-extension?
  • .pyra
    • We're not using fucking DOS, for lord's sake; what's the point of a 3-letter extension?
  • .pyd for pyra-data
  • .pyr
  • .pp for pyra program
  • .pa for pyra app
  • .drp for dragonbox pyra
    • Why limit this to just the Pyra? Could be useful in future systems/elsewhere too.
  • ...
undecided. Maybe a series of polls on the boards? ;)
Keep what is good Document all of the good features and port the content to this table High Not started
Extra bonus points for a design that can easily be used on other distros and devices as well Optional Won't fix
  • .deb dependencies already kill the system-agnostic goal
    • Unless we submit our changes to upstream Debian, which is unlikely to happen
Multiple binaries/icons per app Somebody (lost on the sea of comments) didn't like this design, but it should be possible for special software
Delta packages http://boards.openpandora.org/topic/15530-the-software-side/page-9#entry311600
PATH handling http://boards.openpandora.org/topic/15530-the-software-side/page-9#entry311618