First off, it is important to say that Zigbee and 802.15.4 are not the same thing, even though they show up together very often. IEEE 802.15.4 is the link/physical layer, like WiFi (802.11) and blue-cable Ethernet (802.3). Zigbee is a standard that specifies the network stack above 802.15.4.
There are many stacks based on 802.15.4: Zigbee, XBee, DigiMesh, Thread, etc. Even TCP/IP, as long as it is IPv6, as we will see later. On the other hand, Zigbee only works on 802.15.4, so Zigbee is actually dependent on 802.15.4 in a technical sense, though 802.15.4 takes a marketing ride on Zigbee.
802.15.4 is a PAN ("Personal Area Network") technology, with features that make it different from WiFi and Bluetooth: low speed, low energy demands, network resilience and extremely simple radio stage. It was tailored for the Internet of Things well before IoT became a fad and a household acronym.
802.15.4 supplies the tools for mesh networking, but it does not make the mesh itself. This is left to the implementer of the network layer, like Zigbee. (The upcoming 802.15.5 standard will embed the mesh capability into the link layer.)
Zigbee supplies a complete network stack, top to bottom. Even the application data types and their encodings are rigidly specified.
In spite of the letter soup and the fair number of annoying Z-something acronyms, we can draw an analogy between every functional block and some TCP/IP component that most of us are more familiar with:
It is important to note that Zigbee is analogous to TCP/IP, but it is not compatible with TCP/IP. Zigbee messages cannot be transmitted directly over the Internet. This limitation may be removed by dotdot (check the 6LoWPAN topic towards the end of the text).
Zigbee's network layer (NWK) is responsible for forming the mesh. There are three device types in this mesh: coordinator (exactly one per network), routers and end devices. The mesh topology allows a Zigbee network to cover a big area with ease.
Every Zigbee device connected to a reliable energy source (e.g. mains) can be a router: for example, a smart bulb or a 12V-powered switch. An end device is probably powered by a battery or even by energy harvesting techniques: for example, the remote control of the smart bulb. The routers build the mesh, while each end device is connected to a single router at a time.
Another wrinkle in the difference between routers and end devices: since end devices are powered off most of the time, the routers answer for them in some situations. This is an important feature of Zigbee, but it also makes the stack more complicated and is a target of technical criticism.
The Zigbee APS transport layer is analogous to TCP or UDP. TCP/IP has ports, while Zigbee has endpoints. A single device can implement many applications, each one bound to a different endpoint.
At the application layer, the distance between Zigbee and TCP/IP increases, because in Zigbee everything is in the specification. But we can draw parallels. AF specifies the "envelope" of messages exchanged by an application. We could have similar standardization in TCP/IP if we prescribe that every message must use HTTP, be encoded as JSON and every API is REST.
Every application is called an "Application Object" (AO) in Zigbee. There can be many per device, one per endpoint, not unlike a Web server listening to many ports, each port serving a different REST API.
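To make the port/endpoint analogy concrete, here is a minimal sketch of a device that dispatches incoming messages to Application Objects by endpoint number. The class and method names (Device, bind, deliver) are made up for illustration and do not belong to any real Zigbee stack. The 1..240 range for application endpoints is stated from memory of the specification (endpoint 0 being reserved for the ZDO), so treat it as an assumption.

```python
from typing import Callable, Dict, Optional

# A handler stands in for an Application Object: it receives the
# cluster ID and the raw payload, and returns some result string.
Handler = Callable[[int, bytes], str]


class Device:
    """Hypothetical device that routes messages by endpoint, like ports."""

    def __init__(self) -> None:
        self._endpoints: Dict[int, Handler] = {}

    def bind(self, endpoint: int, handler: Handler) -> None:
        # Assumption: application endpoints live in 1..240.
        if not 1 <= endpoint <= 240:
            raise ValueError("application endpoints must be in 1..240")
        if endpoint in self._endpoints:
            raise ValueError("endpoint already in use")
        self._endpoints[endpoint] = handler

    def deliver(self, endpoint: int, cluster_id: int,
                payload: bytes) -> Optional[str]:
        # Returns None when no application is bound to the endpoint,
        # much like a TCP port nobody is listening on.
        handler = self._endpoints.get(endpoint)
        if handler is None:
            return None
        return handler(cluster_id, payload)


device = Device()
device.bind(1, lambda cluster, data: f"light got cluster {cluster:#06x}")
device.bind(2, lambda cluster, data: f"sensor got cluster {cluster:#06x}")
```

Calling device.deliver(1, 0x0006, b"") reaches only the application bound to endpoint 1, just as a TCP segment reaches only the socket listening on its destination port.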
The ZCL (Zigbee Cluster Library) would be equivalent to a "JSON Schema" on steroids. On top of specifying data types and their encoding, ZCL lists all attributes, mandatory or optional, for each cluster. The "cluster" is equivalent to the Bluetooth profile, or to the service of Bluetooth Low Energy. Every officially sanctioned cluster (that can be certified by the Zigbee Alliance) is listed in the ZCL document. As you can imagine, this is a thick one, with 1600 pages.
In this aspect, Zigbee is pretty much like GATT from Bluetooth Low Energy, even though ZCL attributes are simpler than GATT ones. In either case, applications have no freedom to conjure new protocols, or data types, or encoding methods; they must be implemented in terms of an attribute list.
For example, the cluster 0x0402 (ambient temperature sensor) has one mandatory attribute, whose ID is always 0x0000, type is always int16, unit is always Celsius and it is always scaled in hundredths of a degree. The value 2000 or 0x07d0 means "20 degrees Celsius". The value 0x8000 (which would otherwise read as -327.68) is reserved to indicate undefined temperature (e.g. in case of sensor failure). All of this can be found in the ZCL document and of course it is very simple to implement a client for this sensor.
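Just to show how simple such a client can be, here is a minimal sketch that decodes the raw attribute value described above. The function and constant names are mine, and the sketch assumes the two bytes arrive in little-endian order; it deals only with the attribute value, not with the ZCL frame around it.

```python
import struct
from typing import Optional

TEMPERATURE_CLUSTER_ID = 0x0402   # ambient temperature sensor cluster
MEASURED_VALUE_ATTR_ID = 0x0000   # the single mandatory attribute
INVALID_RAW_VALUE = -32768        # 0x8000 read as int16: "undefined"


def decode_temperature(raw: bytes) -> Optional[float]:
    """Convert a 2-byte int16 attribute value to degrees Celsius.

    Assumes little-endian byte order. Returns None when the sensor
    reports the reserved "undefined" value.
    """
    if len(raw) != 2:
        raise ValueError("int16 attribute must be exactly 2 bytes")
    (value,) = struct.unpack("<h", raw)
    if value == INVALID_RAW_VALUE:
        return None
    return value / 100.0   # attribute is scaled in hundredths of a degree
```

With this, decode_temperature(bytes([0xD0, 0x07])) gives 20.0, and the reserved pattern bytes([0x00, 0x80]) gives None instead of a bogus -327.68.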
It is allowed for a manufacturer to add extra attributes that are not in the ZCL, as long as the standard ones are there. It is also possible to create "apocryphal" applications with attribute lists completely off the ZCL, but those cannot be certified.
ZCL (and GATT) may take some freedom away from application developers, but on the other hand they make application development a lot easier. On the sensor side, normally it is just a matter of grabbing an example application and changing the attribute list. The heavy lifting is already taken care of, either in the app code or in the Zigbee firmware.
The ZDO block (Zigbee Device Object) is an API of base services that every device must implement, e.g. device search, querying capabilities, querying the list of endpoints, etc. In the TCP/IP world, these needs are supplied by a variable set of different protocols (ICMP, DNS, LDAP, etc.).
The Zigbee networks are secure once formed, like Bluetooth ones. The weak spot is the admission of a new device. Security is implemented along with network and transport layers. The direct counterpart in TCP/IP is IPSec. (In TCP/IP, TLS is more often used than IPSec, but TLS lives in the application layer.)
The biggest issue of Zigbee is interoperability, both in "pairing" (network admission of a new device) and in application details. If a "Zigbee" device manufacturer only tests its products against each other and does not certify them (typical of cheaper products), it is easy to overlook some deviation from the standard that will hinder, or even block, interoperability. One could even do this on purpose (vendor lock-in).
Typically, the Zigbee consumer buys a "starter kit" (e.g. bulbs, switches, one coordinator) from one manufacturer, uses only the kit components for a while, and discovers that his kit is de-facto proprietary when he tries to pair a Philips Hue bulb.
The situation is better in the Bluetooth arena because every manufacturer tends to sell just one kind of device. Surely the device has to pair with counterparts from other manufacturers; if that doesn't work, it will be returned as defective by the customer. Most general-purpose operating systems implement at least part of the Bluetooth stack, the OS developers being "neutral" players without any interest in vendor lock-in.
But Bluetooth is not perfect; everybody has had some bad experience with Bluetooth devices that don't pair or don't work, mostly in the case of Bluetooth Classic (Low Energy is better). It is the kind of thing that does not happen in TCP/IP. Here we can see the advantage of open and free standards, peer-reviewed and implemented as open-source by many entities. And TCP/IP has been on the road for 35 years more than Zigbee, after all.
The thing works mostly like in Bluetooth. All the official Zigbee docs are free-as-in-beer, but you need to register at the Zigbee Alliance site. The "Zigbee" logo on a product demands certification. Draft versions of documents may only be accessed by members, and even the first-tier association fee is US$ 40k.
Not exactly a problem of Zigbee, but equally unavoidable, is the forced contact with Zigbee chips and their respective ecosystems with scarce documentation, Windows-only tools, buggy tools, ultra-expensive licenses, etc. (In this industry, it is not difficult to find someone still running Windows XP because the development/certification tools have problems with any other Windows version!)
I care to mention these things because it is a cultural shock and a nuisance for anyone accustomed to free software and TCP/IP, with its free-as-in-freedom RFCs and RFC drafts.
WiFi. The main advantage of WiFi is its ubiquity. Everybody uses WiFi, everybody knows how to use it. There are plenty of WiFi-based IoT devices, but actually it is not adequate for IoT, for many reasons.
Bluetooth Low Energy. BLE cannot form network meshes right now, but it will get this feature in the near future. The GATT application layer is good for IoT. BLE is ubiquitous as well; once it becomes capable of mesh networking, it becomes the most dangerous competitor, IMHO. (Discounting the chance of mesh incompatibility with pre-mesh, older devices. The promise is to support mesh in all devices that can support BLE 4.0... We'll see.)
DigiMesh. This is a proprietary stack from Digi, the manufacturer of the XBee circuits. DigiMesh looks a lot like Zigbee, with some simplifications, e.g. there are only routers (no coordinator, no end devices). The link layer is 802.15.4 as well.
XBee. This name bears different meanings depending on context. XBee is the trademark of Digi's 802.15.4/Zigbee chips. It is also the name of a protocol stack — the first version of XBee is very popular among Arduino users, but it is not 100% compatible with Zigbee proper. The second version of XBee boards is compatible with Zigbee 3.0.
Z-Wave. Conceptually equivalent to Zigbee, though completely proprietary. Created and marketed by Zensys. The specifications used to be secret, but this is slowly changing, with parts being converted to public standards (e.g. the link layer became the ITU G.9959 standard). In fact, because it is proprietary and single-sourced, it ends up having fewer interoperability problems, so it is considered better than Zigbee in this regard.
Thread. Based on 802.15.4 and 6LoWPAN. Proposed by Nest Labs, that thermostat startup bought by Google. I don't know much about this stack; the main attraction is being based on TCP/IP.
Even though the upper layers of the Zigbee stack are completely different from TCP/IP, and general-purpose operating systems don't implement Zigbee, the good news is the developer does not have to deal with these issues. The implementation is supplied by the chip's manufacturer in one way or another. Something similar happens for Bluetooth LE chips.
Besides the implementation being almost-ready-to-run, it may be pre-certified, which boosts the product's certification process. The developer must choose between three basic architectures:
Many manufacturers offer Zigbee chips. But Texas Instruments stands out due to the easy sourcing, low price and great support: documentation, sample code and developer forums.
For now, the most common chips are CC2530 and CC2531. They are "relatives" of the CC2540 (Bluetooth Low Energy) and the microcontroller is 8051-like. The CC2531 variant is a USB dongle. All chips of this series are easy to find, and cheap. They can run a simple application and there are even some general-purpose digital and analog pins. Development and debugging hardware (CCDebugger, cables, etc.) is easy to find as well.
The biggest con of this series is the need for an IAR compiler license, which can easily cost thousands of dollars. A second issue is the small CPU. One workaround is to use the "Network Processor" architecture, the CC253x running the ready-to-use ZNP firmware, so the application can go into your microcontroller of choice. TI itself used to market a CC2530 kit variant with an auxiliary MSP430.
The ZNP firmware API is rather simple, but far from trivial, and the developer needs to know how Zigbee works. (Other manufacturers like Digi offer chips with higher-level APIs.) Since it is a niche thing, there isn't a lot of code around to reuse or to learn from. For general-purpose computers, we have cc-znp written for Node.js. (The Node community is strange in some ways, e.g. the left-pad controversy, but it is not lazy!) Apart from cc-znp, there is some Java example code floating around, and of course TI's own source code for the CC2530-MSP430 kit.
In the Arduino world, the XBee boards from Digi are the most often used.
TI also offers newer chips: CC2538, CC2630 and CC2650 with ARM-based microcontrollers. They are more powerful and TI itself offers a free development tool for them. The CC2650 variant implements both Bluetooth LE and Zigbee. Unfortunately, this series is still more expensive, difficult to find, and there is no USB dongle version (the chip supports USB, so one could solder a USB cable). The ancillary development/debug hardware is also different from the CC25x0 series.
The latter chips have an additional advantage: they can run a TCP/IP stack, since the Contiki operating system (which is sanctioned by TI, too) can be run on them.
Even if Zigbee were perfect, there are always some use cases where TCP/IP would be just better. A TCP/IP-based IoT device is easier to understand and to interact with, and it can connect directly with the Internet (let's set aside the discussion whether it is a good thing or not).
The adaptation of TCP/IP to 802.15.4 is challenging. The small packet size of 802.15.4 (127 bytes) is a major hurdle — IPv6 demands a minimum datagram size of 1280 bytes. The main selling point of 802.15.4 is the mesh network, but the network layer must actually weave the mesh, and vanilla TCP/IP does not know how.
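To get a feeling for the size mismatch, the following back-of-the-envelope sketch estimates how many 802.15.4 frames a single IPv6 datagram needs when it is fragmented the way the 6LoWPAN adaptation layer (discussed next) does it. The numbers are assumptions for illustration: 102 bytes of usable payload per frame (127 minus a worst-case 25 bytes of MAC overhead), a 4-byte header on the first fragment and a 5-byte header on the others, and fragment data rounded down to multiples of 8 bytes. It ignores header compression and link-layer security, so read it as an estimate and not as an implementation.

```python
import math


def fragments_needed(datagram_size: int = 1280,
                     frame_payload: int = 102) -> int:
    """Estimate the number of 802.15.4 frames for one IPv6 datagram.

    Assumptions (illustrative, not a full implementation):
    - frame_payload is what is left of the 127-byte frame after MAC overhead
    - the first fragment spends 4 bytes on its fragmentation header
    - every following fragment spends 5 bytes
    - fragment data is sized in multiples of 8 bytes
    """
    if datagram_size <= frame_payload:
        return 1  # fits in a single frame, no fragmentation needed

    first_data = (frame_payload - 4) // 8 * 8
    next_data = (frame_payload - 5) // 8 * 8
    if first_data <= 0 or next_data <= 0:
        raise ValueError("frame payload too small to carry fragments")

    remaining = datagram_size - first_data
    return 1 + math.ceil(remaining / next_data)
```

Under these assumptions, a minimum-size 1280-byte IPv6 datagram takes 14 frames, and 18 frames if only 81 bytes per frame are usable. Losing any one of them loses the whole datagram, which is why keeping packets small matters so much on this kind of link.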
RFC 4919 lays out the problem statement and goals of 6LoWPAN, and RFC 4944 specifies the adaptation layer itself. The approach is general enough to adapt IPv6 to Bluetooth Low Energy as well.
6LoWPAN prescribes two ways of forming the mesh network: mesh-under and route-over. In the former type, the intra-mesh routing is carried out by the adaptation layer itself, and IPv6 thinks that it rides on an "Ethernet" (RFC 4944). In the latter type, the intra-mesh routing is explicit and carried out by the IPv6 network layer. The mesh is formed with the help of a suitable routing protocol (RFC 6550). All nodes connected to mains should be routers (just like Zigbee).
Nest's Thread is based on 6LoWPAN. This is also the case of Zigbee IP. In a nutshell, Zigbee IP makes some choices that are left open by the 6LoWPAN standard, e.g. the security mechanisms and the layer responsible for the mesh (always route-over in Zigbee IP).
The Zigbee Alliance hasn't released much information about the upcoming dotdot technology. It is probably similar to, or even compatible with, Thread, allowing the transmission of Zigbee ZCL messages over TCP/IP and over the Internet.