Monday, 17 December 2018

M5stack camera

I have been impressed with the projects out there to make a webcam out of an ESP32 and generic camera module.

Thinking of the faff in putting this together tidily I've not bought the bits, which is good as M5stack released a complete module for a very sensible £10.

It really is very small. More when I can fiddle with it.

ESP8266 brownout

As power consumption and battery life are on my mind I did an unscientific brownout/rundown experiment with an ESP8266.

I took an ESP-01S module and a couple of part used AA batteries and left it to run sat on my mesh network until it stopped responding.

This won't be the first time somebody has tried this kind of thing but I was impressed it kept working down to almost 1.8V. It ran for ten hours on a pair of batteries that weren't great to begin with, coming in at 2.7V when I started.

It would have been sending packets every few seconds all this time, so it's not like it was sat there doing nothing. I know my code currently causes quite heavy power use, averaging about 80mA.

This is telling me that my aspiration to run a wearable mesh node 'all day' on normal alkaline batteries is almost certainly achievable. A twin AA battery box is not egregiously large, I can probably desolder the onboard LEDs in a final version and maybe improvements in power management in the code will have some effect.

Of course I'm currently ignoring that I want to connect a GPS module to the wearable and this will eat a consequential amount of power, but I'll worry about that later.

A LiPo battery is what most people would go with but I prefer the wearables to have field replaceable batteries. This is because they will literally be used in a field/forest, the sort of place where you worry about being able to charge your phone as the evening draws in. Having a few AAs to hand is very easy to manage if something goes flat.

For my next experiment along these lines I'll try the same with an 18650 cell and a 3.3V regulator. If AAs won't cut it, 18650 cells are the sane removable lithium option in my opinion. There are also 14500 cells, which are AA sized so very convenient, but these have a much lower capacity.

Sunday, 16 December 2018

Obscure Arduino tips #1

Want to know exactly which ESP8266 board your sketch is compiled for in the Arduino IDE and act on that in your sketch?

Why do I need this? I'm building for a couple of different flavours of ESP8266 boards and wanted to know which board is the target so I can change a value to match the board automatically.

I'm using the ESP8266 core as an example but this should also apply to other boards supported in the Arduino IDE with some tinkering.

  • Find the file 'boards.txt' in ESP8266 Arduino core. On Windows this will be somewhere like "C:\Users\YOUR USERNAME\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.4.2\boards.txt"
  • Search for the name of your board as shown in the Arduino IDE, for example "LOLIN(WEMOS) D1 mini Pro". You should find a line that looks like "d1_mini_pro.name=LOLIN(WEMOS) D1 mini Pro"
  • Immediately below this there should be a line similar to "build.board=ESP8266_WEMOS_D1MINIPRO".
  • The compiler passes the build.board value on as a #define but it prepends "ARDUINO_".
  • So if "ARDUINO_ESP8266_WEMOS_D1MINIPRO" is defined in your code you know it has been compiled for a Wemos D1 mini pro.

I am using this so I can call ESP.getVcc() and get an accurate value for Vcc across a couple of different flavours of module. Here's a usable code fragment...


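// ESP.getVcc() needs ADC_MODE(ADC_VCC); at the top of the sketch to read the supply
float vcc()
{
  uint16_t v = ESP.getVcc();
  #if defined(ARDUINO_ESP8266_WEMOS_D1MINI) || defined(ARDUINO_ESP8266_WEMOS_D1MINIPRO)
    return((float)v/918.0f);   // Wemos D1 mini and D1 mini Pro
  #elif defined(ARDUINO_ESP8266_GENERIC)
    return((float)v/1024.0f);  // generic ESP8266 modules
  #else
    // Fail the build rather than fall off the end without returning a value
    #error "vcc(): no divisor known for this board"
  #endif
}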

The slightly different values are because the boards have a slightly different set of resistors on the Vcc monitoring connection and these are the values I've found generate realistic readings verified by a meter.

The IDE also defines the macro ARDUINO_BOARD as a string containing the board name, which you can use, for example...

Serial.print(ARDUINO_BOARD);


You could use this to avoid wading through boards.txt, or code the above example differently.
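One way to code the above example differently, as a sketch rather than a tested drop-in, is to pick the divisor once at compile time and keep the conversion in a small function. The names kVccDivisor and rawToVolts are made up for illustration, the divisors are the ones from the fragment above, and falling back to 1024 for any other board is an assumption.

```cpp
#include <cstdint>

// Divisor chosen at compile time from the board macro the Arduino build passes in.
#if defined(ARDUINO_ESP8266_WEMOS_D1MINI) || defined(ARDUINO_ESP8266_WEMOS_D1MINIPRO)
constexpr float kVccDivisor = 918.0f;   // value found to work on the Wemos boards
#else
constexpr float kVccDivisor = 1024.0f;  // generic modules and (assumption) anything else
#endif

// Convert a raw supply reading, as returned by ESP.getVcc(), into volts.
float rawToVolts(uint16_t raw) {
  return static_cast<float>(raw) / kVccDivisor;
}
```

In a sketch this would be called as rawToVolts(ESP.getVcc()).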

Saturday, 15 December 2018

ESP-Now BATMAN test rig

Unusually, I posted one of my videos publicly and I got a good swathe of comments so I think I'll do this more often in the future.

One of the questions was how big does the mesh scale?

Frankly I don't know.

My aim is to support about 40 nodes actively sending data every few seconds because location tracking is my primary goal. This feels achievable.

However to prove this I'll either have to build it or use some network modelling software. I like building stuff.

I have over time bought quite a few ESP8266 modules for various projects, a lot of them speculatively or for things that are over and done with.

  • 12x Wemos D1 Mini
  • 12x Wemos D1 Mini pro
  • 10x ESP-01S 1MB flash
  • 10x ESP-01S 512KB flash

So when I allow for a few I've lost, given away or killed that's about forty.

Plugging all this in at the same time would be a pain in the neck. So I've built a little test rig that allows me to get fourteen ESP-01S running off my bench power supply.

This circuit is the bare minimum: it supplies power and a pullup resistor on CH_PD so each module boots. In principle you need pullups on GPIO0 and GPIO2 as well but in my experience they boot fine with these pins left floating, at least reliably enough for a little testing.

This took a few leisurely hours to build as there was quite a chunk of soldering, especially as I added individual on/off switches.

Running off my bench PSU it draws 1.1-1.2A, which is quite a lot: if all fourteen modules are fitted that works out at roughly 80mA each. I need to work on my power efficiency, but given my desired runtime is 'all day' I reckon I can get there. The outdoor nodes lasted about eight hours, which is about in line with this power draw given the rubbish batteries I used.

Compared to a sensor that sleeps almost all the time and draws microamps when doing so this power usage is awful, but my kit just can't sleep as it has other jobs to do, one of which is always being there to relay traffic for other nodes.

If the nodes were connecting to APs, the various radio sleep modes would reduce consumption massively, as the DTIM interval announced in the AP's beacons means they know when to wake up. Without an AP to manage this I'd have to create my own scheduling algorithm equivalent to DTIM. This is a job for later if I can't make improvements in other ways. I've already got a time sync protocol; it might just need to be more accurate.

I did a little video of this test rig running...



Tuesday, 11 December 2018

ESP-Now BATMAN time sync

One of the things I really liked about PainlessMesh was it had a built in time protocol that synced across all the nodes. If you're building a mesh of interacting things then having them share a common clock is useful.

The PainlessMesh developers have gone the whole hog and implemented an NTP inspired protocol taking into account latency and jitter of communication between nodes. While I don't have the enthusiasm to do this, I have come up with a simple clock syncing option that seems to work fine at low mesh sizes.

I was already sharing uptime information across the mesh so my very simple scheme is as follows...

  • The node with the highest uptime is considered the time server.
  • All other nodes work out their offset from this figure as NHS packets come in.
  • There's then a function that returns this calculated 'mesh time'.
  • If the clocks drift then every NHS packet from the time server tweaks it back into line.
  • If the time server goes away, the node with the next highest uptime takes over, faking its own uptime to be what it understands 'mesh time' to be.
  • If the previous time server comes back the current time server stops.

This simplistic approach seems to be working just fine so far and the sum of all clock drift corrections over several hours is in the tens of milliseconds. Which means it really doesn't need to sync very often.
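As a rough illustration of the scheme in the list above, here is a minimal sketch that keeps 'mesh time' as an offset from local uptime. It is not the code running on the nodes: MeshClock, the non-zero node IDs and the millisecond units are all assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical helper: tracks mesh time as an offset from this node's uptime.
struct MeshClock {
  uint32_t serverId = 0;  // 0 means this node is acting as the time server
  int64_t offsetMs = 0;   // mesh time minus local uptime

  // Call for every NHS packet; senderUptimeMs is the uptime the sender advertises.
  void onNhs(uint32_t senderId, uint32_t senderUptimeMs, uint32_t localUptimeMs) {
    const int64_t candidate =
        static_cast<int64_t>(senderUptimeMs) - static_cast<int64_t>(localUptimeMs);
    if (senderId == serverId) {
      offsetMs = candidate;   // packet from the current server: correct any drift
    } else if (candidate > offsetMs) {
      serverId = senderId;    // sender is ahead of our mesh time: it becomes the server
      offsetMs = candidate;
    }
  }

  // The time server has gone away: take over, keeping the offset so the
  // uptime this node advertises is really its idea of mesh time.
  void serverLost() { serverId = 0; }

  uint32_t meshTime(uint32_t localUptimeMs) const {
    return static_cast<uint32_t>(localUptimeMs + offsetMs);
  }
};
```

A node would advertise meshTime(millis()) as its uptime in its own NHS packets. Ties between nodes with near-identical uptimes are not handled here and would need some sort of tiebreak.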

The nodes aren't perfectly in sync down to the millisecond (mostly because ESP-Now iterates through sending packets to its peers) but nobody is going to notice in real use. This is entirely about making events happen in sequence on human timeframes. It doesn't need to be more accurate to achieve this.

I've done a little video demo of it syncing up.


Sunday, 9 December 2018

ESP-Now BATMAN first field test

It was our end of year LARP social event this weekend which includes a little bit of shooting at each other in the woods with Lasertag guns, so I took advantage of this to do a field test of my code and hardware.

With the current sketch loaded onto the six nodes I originally built for testing PainlessMesh I headed off to Banbury. It's been a real success, with a couple of provisos.

First the good news. The range of five of the nodes was very solid: I could see direct ELP neighbour discovery packets from them using an ESP-01 plugged into my laptop inside a large wooden hut. This was just with the trace antenna. I think the sixth node has a fault where I've moved the link to connect the external antenna. This node had crappy connectivity despite being no further away from its neighbours than the others.

Secondly propagation of OGM packets works perfectly, I was getting packets forwarded from every node and the ESP-01 connected to my laptop looked to be making good routing decisions.

Finally I got almost eight hours runtime out of at least one of the nodes with nasty cheap pound shop NiMH AA batteries in the freezing cold and pouring rain. This bodes well for better batteries and my target is only to have 'all day' life from a set, fitting fresh at the start of each day. When I got home and opened the nodes they were all bone dry, despite my vague worry they'd leak at the antenna.

The big proviso is this was only a small site (42 acres) so I simply couldn't expect much in the way of a range test. I think I could have achieved coverage of this site with PainlessMesh.

The other proviso is I've now got eight hours of logs to wade through before I'm sure what I've just said is true. Mostly I was there to socialise so peering at a screen for hours wasn't going to happen.

I now have until March for the next field testing opportunity, unless I make a special trip somewhere.

In the meantime I need to make a device that actively uses the mesh so this isn't all just peering at MAC addresses and routing tables. Given my end goal is GPS tracking and status reporting I think it's time to build a minimal GPS tracker that reports back to a specific node.

Monday, 3 December 2018

ESP-Now BATMAN/NHS progress

I'm still wrestling with making what I hope will be a fully featured mesh protocol for battery powered ESP8266 devices.

Having put in the main building blocks of ELP and OGM from BATMAN IV I wanted to add something of my own to monitor the 'health' of the devices. I've coined the term Node Health Status (NHS) for these packets. Because British.

This is relatively simple just now. Each node periodically sends its uptime, free heap memory, starting heap memory, supply voltage, number of peers and radio transmission power to all its neighbours. Nodes store this and make it persistently available.
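To make that concrete, an NHS packet could be laid out something like the struct below. This is only a sketch: the field names, widths and units are guesses for illustration, not the actual wire format. The one firm constraint is that an ESP-Now payload tops out at 250 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical NHS payload; names, widths and units are illustrative guesses.
struct NhsPacket {
  uint32_t uptimeMs;          // time since boot
  uint32_t freeHeap;          // bytes of heap free now
  uint32_t startingHeap;      // bytes of heap free just after boot
  uint16_t supplyMillivolts;  // supply voltage
  uint8_t peerCount;          // number of ESP-Now peers
  uint8_t txPower;            // current radio transmission power setting
};

constexpr std::size_t kEspNowMaxPayload = 250;

// Fields are ordered so there is no padding and the struct can be sent as-is.
static_assert(sizeof(NhsPacket) == 16, "unexpected padding in NhsPacket");
static_assert(sizeof(NhsPacket) <= kEspNowMaxPayload, "NHS packet too big for ESP-Now");
```

At sixteen bytes there is plenty of room left in a packet if more health measures turn out to be useful.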

My reasoning for doing this is it will allow me to 'visualise' the network more accurately without having direct access to each node and certain features need an election process to decide which nodes to run on.

The first use I'll have for this is that nodes with fewer peers will run as a SoftAP to help discovery of new nodes. I'm hoping having fewer peers is a passable heuristic for being somewhere 'sparse' on the network.

Monitoring of supply voltage and uptime give simple measures of battery health and if a node has rebooted. Heap monitoring is something that may feed into feature elections.

I also spent a fair bit of time playing with management of transmission power. So long as it is not experiencing consequential packet loss each node lowers power until packet loss starts. Then it raises power and sets a 'floor' it will not go below. Periodically it lowers the floor slightly to see if power can go down further, but will immediately raise it again if packet loss starts. If we had an easily accessible RSSI measure for ESP-Now peers this wouldn't be necessary but the only RSSI information exposed to us in the standard libraries is the RSSI for an AP association.
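The power management loop described above can be sketched roughly as follows. The names, the step size and the use of 0.25dBm units (so 82 is 20.5dBm) are assumptions for the example, as is the idea that something else decides whether there was 'consequential' packet loss in the last interval.

```cpp
#include <algorithm>

constexpr int kMaxPower = 82;  // assumed full power, in 0.25dBm units (20.5dBm)
constexpr int kStep = 4;       // assumed adjustment of 1dBm at a time

// Hypothetical controller: steps power down while the link is clean and
// remembers a floor it will not go below once loss has been seen.
struct TxPowerController {
  int power = kMaxPower;
  int floorPower = 0;

  // Call once per measurement interval with whether packet loss was seen.
  void onInterval(bool packetLoss) {
    if (packetLoss) {
      power = std::min(power + kStep, kMaxPower);  // back off upwards at once
      floorPower = power;                          // and do not go below this again
    } else if (power - kStep >= floorPower) {
      power -= kStep;                              // link is clean: try a bit lower
    }
  }

  // Call occasionally to let the controller test a slightly lower floor.
  void probeFloor() { floorPower = std::max(floorPower - kStep, 0); }
};
```

On the ESP8266 Arduino core the result could then be applied with something like WiFi.setOutputPower(power / 4.0f), which takes a value in dBm.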

It might be possible to infer from transmission power, number of peers and so on whether a node actually has to act as a router or not, switching that off/on to conserve resource. However that's probably out of scope for now and is making a big change to the BATMAN IV routing algorithm that would have big unintended consequences.

Saturday, 17 November 2018

ESP-Now BATMAN/OGM progress

More dull text to look at but this is the output from one of my mesh nodes now I have the OGM routing protocol working.

I was right in thinking that getting ELP working nicely would lead to being able to implement an interpretation of OGM pretty smoothly. Again this is not an exact implementation as my use case is different but it is quite similar to the documentation. I'm using the TQ measure in BATMAN IV, not the throughput model in BATMAN V.

Now, even with sharing of neighbours under ELP disabled the whole network knows about all the nodes whether they're reachable in one hop or not. If I turn neighbour sharing on again in ELP the mesh should build more quickly with more redundant links.

It appears that the routing table picks a good route to each node, including a two-hop route where there is a poor reliability via a one-hop route.

Next step is implementing my own 'ping' and 'traceroute' style functions so I can verify this, but again because I have already implemented discovery of routes and packet forwarding this should be not overly painful. BATMAN routing is intended to be simple.

Once that's done I'll make as big a network of real nodes as I can manage, which I hope to get up to about twenty-five now some more Wemos D1 Mini Pro have arrived, and see what happens.

Thursday, 8 November 2018

ESP-Now BATMAN/ELP progress

Over the last few weeks I've been tinkering with ESP-Now and have implemented an interpretation of BATMAN's 'echo location protocol' ELP, as set out here.

As their documents say at the top, this is an old version, but for my purposes it will suffice and I'm not looking to implement something compatible with BATMAN as deployed on other platforms.

I'd definitely class what I've done as an interpretation rather than an implementation. While the documentation suggests ESP-Now can generate broadcasts, which BATMAN specifies for ELP, I missed this on first reading (it's mentioned once in passing) and worked around this. The model for ESP-Now communication is between pre-defined peers so baking in management of these peers as part of my take on ELP is not lost effort. Yet.

In BATMAN each thing on the network is a node. If it's actively routing traffic with BATMAN it is an originator and if it is reachable in one hop it is a neighbour.

ESP-Now's concept of peers maps fairly directly to BATMAN's concept of neighbours. However you can't send data to an arbitrary node without adding it as a peer first. Peers are referred to by their primary MAC address, much like in BATMAN. You can have a maximum of 20 peers unless you use encryption when it's reduced to 10. Which is why making management of ESP-Now peers part of my interpretation of ELP, rather than just broadcasting, isn't a bad thing. You can trivially send packets to every peer and have the ESP-Now library do the management of that process for you, but it's not a broadcast and won't be sent to non-peers.

Something ESP-Now adds that isn't in BATMAN's broadcast model for ELP is that you get delivery confirmations from peers. Normally BATMAN measures the quality of transmission to first hops on the network with its routing protocol OGM, as these packets are echoed back to the sender. With ESP-Now unicast I have been able to get a measure of transmission quality (TQ) from ELP alone.
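One simple way to turn delivery confirmations into a TQ figure is a sliding window over the last few sends, scaled to 0-255. The sketch below assumes that approach; the LinkQuality class and the window of sixteen packets are inventions for the example, not necessarily how the nodes do it.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWindow = 16;  // assumed number of recent sends to remember

// Hypothetical per-peer tracker fed from the ESP-Now send callback.
class LinkQuality {
 public:
  // Record whether the latest unicast to this peer was confirmed.
  void record(bool delivered) {
    if (count_ == kWindow) {
      if (window_[next_]) --successes_;  // forget the oldest result
    } else {
      ++count_;
    }
    window_[next_] = delivered;
    if (delivered) ++successes_;
    next_ = (next_ + 1) % kWindow;
  }

  // Transmission quality from 0 (nothing delivered) to 255 (everything delivered).
  uint8_t tq() const {
    return count_ == 0 ? 0 : static_cast<uint8_t>(successes_ * 255 / count_);
  }

 private:
  bool window_[kWindow] = {};
  std::size_t next_ = 0;
  std::size_t count_ = 0;
  std::size_t successes_ = 0;
};
```

record() would be called from the callback registered with esp_now_register_send_cb(), where a status of zero means the peer acknowledged the packet.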

What ESP-Now doesn't handle, and BATMAN achieves with broadcasts, is initial discovery of peers; there is an expectation this is hardcoded or done via other means of pairing. So I'm using the standard Wi-Fi SoftAP and AP scanning methods from the ESP8266 Wi-Fi libraries. The scan looks for any SSIDs that match what it's looking for and attempts to add the device as a peer if it isn't one already. Once a node has some peers it switches its SoftAP off, but among a group of peers at least one leaves the SoftAP on so that group of peers can be found. I might replace this mechanism with something that involves ESP-Now broadcasts, but at the moment it works quite nicely. It's worth noting that the BSSID of an ESP8266 is not its primary MAC address. For the BSSID they use a locally administered variant of the primary MAC address, so you have to clear the locally administered bit (the second least significant bit of the first octet) to derive the primary MAC address.
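Deriving the primary MAC address from a scanned BSSID might look like the sketch below. It assumes the only difference is the locally administered bit as IEEE defines it, which is 0x02 in the first octet; check a module or two against this before relying on it. The Mac type and function name are made up for the example.

```cpp
#include <array>
#include <cstdint>

using Mac = std::array<uint8_t, 6>;

// Assumption: the SoftAP BSSID is the primary MAC with the locally
// administered bit (0x02 of the first octet) set, and nothing else changed.
Mac primaryMacFromBssid(Mac bssid) {
  bssid[0] &= static_cast<uint8_t>(~0x02);  // clear the locally administered bit
  return bssid;
}
```

The result is what would be handed to ESP-Now when adding the scanned device as a peer.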

As ELP includes a list of neighbours in the packet, I've a mesh where each node is an originator that discovers new neighbours, then is notified of their neighbours. Often some of these are one-hop reachable too so the mesh forms very nicely. The measure of TQ I have works very well to ensure only reachable neighbours are advertised through ELP, so each node only has believably reachable peers added and any that aren't can be aged out.

I also spent a lot of time building a basic user interface into this on the serial console, which you can see in the screenshot. When you're making your own network protocol you need to also make your own tools for monitoring and troubleshooting it. Being able to look at the peer table and logs of every node has made this much less of a head-scratching task than it might have been. This UI console code is all wrapped up in conditional compiler directives so I plan on leaving it in place in the code long term. I do have some aggro though as the Windows driver for the USB serial chipset used on the WeMos D1 mini appears to be flaky. At some point it stops sending data to the ESP so you can't control the UI even though you can still see it. This doesn't happen on Linux so I'm pretty sure it's nothing to do with my code.

It's got to the point where I've had up to 12 nodes in a network and they've been up for 100+ hours managing neighbours coming and going without complaint or failure that I can see.

Emboldened by this I did a field test during a game over the weekend. I had four nodes with PIR motion detectors sending packets when triggered to a bare node connected to a laptop I was using as a prop.

This was a total failure. Not massively surprising as I hacked the code together from existing hardware over lunch. It did work when connected to the laptop but when powered by USB charge banks the nodes couldn't even discover each other, which I know works well. As I was mid-game I had no time to troubleshoot.

Assuming that failure was something trivial, I now need to move on to implementing OGM, which will allow the mesh to do proper multi-hop routing. Right now it can only deal with routing two hops to 'peers of peers' and it needs to be able to route end-to-end across the whole mesh. I think I've laid a solid foundation for this with my interpretation of ELP so I'm hoping this won't be too hard work. OGM produces a full routing table to all nodes with a composite TQ to each one made from the TQ of each hop. Once this is done it's only a small step to being able to forward arbitrary packets across the whole mesh.
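As a sketch of what a composite TQ means, the function below combines per-hop values the way I understand BATMAN IV to do it: TQ runs from 0 to 255 and the path value is the product of the hops, rescaled. Real BATMAN also applies a hop penalty, which is left out here, and the names are made up for the example.

```cpp
#include <cstdint>

constexpr unsigned kTqMax = 255;  // a perfect link

// Combine the TQ of the path so far with the TQ of the next hop.
constexpr uint8_t combineTq(uint8_t pathTq, uint8_t hopTq) {
  return static_cast<uint8_t>(static_cast<unsigned>(pathTq) * hopTq / kTqMax);
}
```

With these numbers two decent hops of 200 each combine to 156, which still beats a poor direct link of 100, so a two-hop route can legitimately win over a one-hop route.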

Tuesday, 16 October 2018

Nanananana

B.A.T.M.A.N.

I think I'm going to have a go at implementing a mesh network scheme/algorithm inspired by the ones used in batman-adv IV for my project. I can't actually implement batman-adv: it's too much work and inappropriate for the ESP platform. The bare bones of the mesh algorithm on the other hand look like a good candidate, and unlike a lot of things floating around online this is an active open source project that's been through multiple iterations of working on the difficulties of mesh networking.

I need to think through how ESP-Now not having a broadcast mechanism will affect it. ESP-Now can send a packet to all a node's peers, but that's not the same thing, those peers need to be found and actively added in the first place.

I've done a trivial test of ESP-Now that involves running SoftAP and periodically scanning for the beacon frames, then switching the SoftAP off once you've joined the mesh. However this would very swiftly end up with multiple isolated meshes that can't find each other.

Current thinking is to have some nodes, at least one per mesh, run as SoftAP, to maintain options for peer discovery, spinning up more periodically if none are directly visible from a section of the mesh.

Likewise ESP-Now adds another layer of complication/admin as the peer table is limited to 20. This peer list will have to be pruned/updated, not something that happens in batman-adv, which only considers if a node is one-hop reachable. So there will be three potential states for a node, one-hop reachable peers, one-hop reachable non-peers and multi-hop reachable.
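The three states and the pruning could be modelled along these lines. This is a thinking-aloud sketch: the Node fields are hypothetical, and evicting the peer with the lowest link quality when the table is full is just one plausible policy, not a decision that has been made.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Reach { OneHopPeer, OneHopNonPeer, MultiHop };

// Hypothetical record kept for every node the mesh knows about.
struct Node {
  uint32_t id;
  bool oneHop;   // heard directly rather than via another node
  bool isPeer;   // currently in the ESP-Now peer table
  uint8_t tq;    // link quality, higher is better
};

constexpr std::size_t kMaxPeers = 20;  // ESP-Now peer table limit

Reach classify(const Node& n) {
  if (!n.oneHop) return Reach::MultiHop;
  return n.isPeer ? Reach::OneHopPeer : Reach::OneHopNonPeer;
}

// Returns the index of the peer to drop when the table is full, or -1 if
// there is still room. Assumed policy: drop the peer with the lowest TQ.
int peerToEvict(const std::vector<Node>& nodes) {
  std::size_t peers = 0;
  int worst = -1;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (!nodes[i].isPeer) continue;
    ++peers;
    if (worst < 0 || nodes[i].tq < nodes[worst].tq) worst = static_cast<int>(i);
  }
  return peers >= kMaxPeers ? worst : -1;
}
```

A one-hop non-peer with a better link could then be promoted into the slot freed by the eviction.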

Thing is I know I could fix this easily if I just started sending my own crafted vendor action frames, especially broadcasts, but this way lies an even deeper rabbithole of reinventing the wheel than the one I'm already in. Using the standard SoftAP and Wi-Fi scanning libraries is an expedient way to avoid this, I hope.