Golang decoder for deserializing serialized PHP data structures. Because PHP is obnoxious.
Benjamin Shelton aafdc109ab Updated README with associative array deserialization example. 1 month ago
LICENSE Added LICENSE. 2 months ago
README.md Updated README with associative array deserialization example. 1 month ago
decoder.go Implemented array => map support for associative arrays with string keys 1 month ago
decoder_test.go Implemented array => map support for associative arrays with string keys 1 month ago
go.mod Added package. 2 months ago
parser.go Support zero-length arrays. 1 month ago
parser_test.go Test nested associative arrays. 1 month ago
tokenizer.go Implemented bool type and added unit tests + benchmarks. 1 month ago
tokenizer_test.go Updated unit tests to tests a) complex nested tokenization and b) 1 month ago

README.md

PHPdec(oder)

This library provides an exceedingly simplistic utility for decoding PHP-serialized data structures into Golang data structures. It was written because the current assortment of PHP readers don't (correctly) handle mapping associative arrays into structs. Indeed, of these, only one (!) provides a means of mapping anything into a struct, and only if it's a PHP object (!!). Familiarity with software like WordPress and similar platforms will suggest that this has limited utility: the bulk of serialized data from PHP is an associative array with mixed value types. Mapping this into a map[string]interface{} (or worse: map[interface{}]interface{}) isn't simply an exercise in frustration--it has performance implications as well! Munging the input by replacing every array ("a" literal) with an object ("O" literal, plus some string-y bits here and there) isn't a viable solution either, since it would require rewriting the entire serialized string.
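
To make the frustration concrete, here is a minimal sketch in plain Go (PHPdec is not involved) that contrasts reading a nested value out of a map[string]interface{} with reading the same value from a struct. The names viewsFromMap and page, and the "stats"/"views" keys, are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// viewsFromMap digs an integer out of a loosely typed map, the shape a
// generic decoder might return for a nested associative array. Every level
// needs its own type assertion and its own failure path.
func viewsFromMap(m map[string]interface{}) (int, error) {
	stats, ok := m["stats"].(map[string]interface{})
	if !ok {
		return 0, errors.New(`"stats" is missing or not an array`)
	}
	views, ok := stats["views"].(int)
	if !ok {
		return 0, errors.New(`"views" is missing or not an integer`)
	}
	return views, nil
}

// page is the typed equivalent (hypothetical); field access needs no assertions.
type page struct {
	Stats struct {
		Views int
	}
}

func main() {
	loose := map[string]interface{}{
		"stats": map[string]interface{}{"views": 1002},
	}
	v, err := viewsFromMap(loose)
	fmt.Println(v, err) // 1002 <nil>

	var typed page
	typed.Stats.Views = 1002
	fmt.Println(typed.Stats.Views) // 1002
}
```

Each extra level of nesting in the loose form adds another assertion and another error branch, which is the bookkeeping that decoding directly into a struct avoids.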

Usage

PHPdec's API is centered around a single function that follows suit with most of the Golang ecosystem: Unmarshal. Since this API (if one were so generous as to call it such!) is so simple, usage is equally straightforward:

Decode into a scalar type (not terribly useful):

var decoded string
if err := phpdec.Unmarshal(serialized, &decoded); err != nil {
    // Handle error.
}

Decode into map (somewhat useful!):

// Using the following PHP array:
//
// $a = [
//     'page_count' => 42,
//     'views' => 1002,
// ];
//
// which PHP's serialize($a) renders as:
//
// a:2:{s:10:"page_count";i:42;s:5:"views";i:1002;}

decoded := make(map[string]int)
if err := phpdec.Unmarshal(serialized, &decoded); err != nil {
    // Handle error.
}

fmt.Println(decoded["page_count"]) // 42
fmt.Println(decoded["views"]) // 1002

// Recommended:
if v, ok := decoded["page_count"]; ok {
    fmt.Println(v) // 42
} else {
    // Key not found...
}

At present, only integer and string map keys are supported. Mixed-type keys are not supported at all, meaning that PHP arrays containing a mix of both integer offsets and associative keys may not behave predictably.

Further, while most data types for map values should work--if they're supported by PHPdec--test coverage isn't there yet. The tests mostly cover those use cases that we require for other libraries at present; all else is supported strictly as interest or inclination allow.
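
To illustrate what "mixed-type keys" means in practice, the following sketch classifies a list of array keys as all-integer, all-string, or mixed. It is plain Go and not part of PHPdec; the keyKind type and the classifyKeys helper are hypothetical names, and the serialized forms in the comments are what PHP's serialize() produces for the arrays shown.

```go
package main

import "fmt"

// keyKind describes the key types found in a PHP array (hypothetical type).
type keyKind string

const (
	noKeys     keyKind = "empty"
	intKeys    keyKind = "integer"
	stringKeys keyKind = "string"
	mixedKeys  keyKind = "mixed"
)

// classifyKeys reports whether keys are all ints, all strings, or a mix.
// Keys of any other Go type are reported as mixed.
func classifyKeys(keys []interface{}) keyKind {
	kind := noKeys
	for _, k := range keys {
		var this keyKind
		switch k.(type) {
		case int:
			this = intKeys
		case string:
			this = stringKeys
		default:
			return mixedKeys
		}
		if kind == noKeys {
			kind = this
		} else if kind != this {
			return mixedKeys
		}
	}
	return kind
}

func main() {
	// ['a', 'b'] serializes to a:2:{i:0;s:1:"a";i:1;s:1:"b";}
	fmt.Println(classifyKeys([]interface{}{0, 1})) // integer

	// ['page_count' => 42, 'views' => 1002]
	fmt.Println(classifyKeys([]interface{}{"page_count", "views"})) // string

	// ['a', 'k' => 'b'] serializes to a:2:{i:0;s:1:"a";s:1:"k";s:1:"b";}
	fmt.Println(classifyKeys([]interface{}{0, "k"})) // mixed
}
```

Only the first two shapes fall within the supported cases described above; the third is the kind of array that may not decode predictably.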

Decode into struct (much more useful!):

type decoded struct {
    Value int
    NestedStruct *nestedStruct `php:"nested_data"`
}

type nestedStruct struct {
    AnotherValue int
    AnotherString string
}

// Using the following PHP array:
//
// $a = [
//     'value' => 42,
//     'nested_data' => [
//         'anotherValue' => 84,
//         'anotherString' => 'this is just an example',
//     ],
// ];

var ds decoded
if err := phpdec.Unmarshal(serialized, &ds); err != nil {
    // Handle error.
}

fmt.Println(ds.Value) // 42
fmt.Println(ds.NestedStruct.AnotherValue) // 84
fmt.Println(ds.NestedStruct.AnotherString) // "this is just an example"

The astute reader will note that a field may be mapped to a different array key using the struct tag php followed by the target name. If the php struct tag is not provided, array keys are matched against the struct field name with its first character lowercased (so AnotherValue matches the key anotherValue).
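
The matching rule described above can be sketched with the standard reflect package. This is an illustration of the rule as the README states it, not PHPdec's actual implementation; keyFor and keysOf are hypothetical helpers, and the struct types are copied from the example above.

```go
package main

import (
	"fmt"
	"reflect"
	"unicode"
	"unicode/utf8"
)

// keyFor returns the PHP array key a struct field would be matched against:
// the `php` tag if present, otherwise the field name with its first
// character lowercased. Hypothetical helper, not part of PHPdec.
func keyFor(f reflect.StructField) string {
	if tag, ok := f.Tag.Lookup("php"); ok && tag != "" {
		return tag
	}
	r, size := utf8.DecodeRuneInString(f.Name)
	return string(unicode.ToLower(r)) + f.Name[size:]
}

type nestedStruct struct {
	AnotherValue  int
	AnotherString string
}

type decoded struct {
	Value        int
	NestedStruct *nestedStruct `php:"nested_data"`
}

// keysOf lists the expected array key for every field of a struct value.
func keysOf(v interface{}) []string {
	t := reflect.TypeOf(v)
	keys := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		keys = append(keys, keyFor(t.Field(i)))
	}
	return keys
}

func main() {
	fmt.Println(keysOf(decoded{}))      // [value nested_data]
	fmt.Println(keysOf(nestedStruct{})) // [anotherValue anotherString]
}
```

Note that only the first character is lowercased, so a key written in snake_case on the PHP side (such as nested_data) needs an explicit php tag.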

Further Examples

For more examples, consider examining the decoder_test.go unit test collection. This includes working illustrations of how to use the decoder for things like arrays and other supported data types.

Pre-1.0 Warning

Please be aware that this is a pre-1.0 library. As such, the API may be subject to change. However, unless some dramatic deficiency is discovered in the existing API (of one function!), it may be assumed that this library will not persist in a state of flux. There is no intention to expose the underlying tokenizer/parser.

Also, there are presently no plans to implement a PHP serializer, nor is there any intent to implement a stream deserializer. This library is intended to read data from existing PHP applications.

At present, deserialization of PHP objects is not supported. Implementing this feature is planned for the future--most probably when we have a specific need to consume serialized PHP objects--but pull requests are welcome. There is a stub function in parser.go named parseObject that currently exists as a placeholder for this functionality. It would then need to be wired into the parseTreeNode struct, probably as a separate field (objFields of type *parseTreeNode?), and then wired into decoder.go.
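
For anyone considering a pull request, the sketch below shows the part of the serialized object format that differs from an array: the header O:<class name length>:"<class name>":<property count>:{, after which the properties follow as key/value pairs much like array entries. The objectHeader function is a hypothetical, standalone helper that works on a raw string; it is not the parseObject stub and makes no assumptions about how the library's tokenizer or parseTreeNode are structured.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errBadObject = errors.New("malformed object header")

// objectHeader parses the leading `O:<len>:"<class>":<count>:{` of a
// serialized PHP object. It returns the class name, the property count,
// and the remaining input after the opening brace.
func objectHeader(s string) (class string, count int, rest string, err error) {
	if !strings.HasPrefix(s, "O:") {
		return "", 0, "", errBadObject
	}
	s = s[2:]

	// Class name length, in bytes.
	i := strings.IndexByte(s, ':')
	if i < 0 {
		return "", 0, "", errBadObject
	}
	n, err := strconv.Atoi(s[:i])
	s = s[i+1:]
	if err != nil || n < 0 || n > len(s)-3 {
		return "", 0, "", errBadObject
	}

	// Quoted class name followed by a colon: "<class>":
	if s[0] != '"' || s[n+1] != '"' || s[n+2] != ':' {
		return "", 0, "", errBadObject
	}
	class = s[1 : n+1]
	s = s[n+3:]

	// Property count followed by ":{".
	i = strings.IndexByte(s, ':')
	if i < 0 {
		return "", 0, "", errBadObject
	}
	count, err = strconv.Atoi(s[:i])
	if err != nil || count < 0 || !strings.HasPrefix(s[i:], ":{") {
		return "", 0, "", errBadObject
	}
	return class, count, s[i+2:], nil
}

func main() {
	class, count, _, err := objectHeader(`O:8:"stdClass":2:{s:1:"a";i:1;s:1:"b";i:2;}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(class, count) // stdClass 2
}
```

Once the header is consumed, the remaining input up to the closing brace has the same key/value layout as an array body, so a real implementation could plausibly reuse the existing array handling for it.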

License

PHPdec is offered under the NCSA license, as with many of my other projects. It is an extremely permissive license similar to both the BSD and MIT licenses and can be used in commercial works. As an added bonus, it covers any documentation as well.