Thursday, 8 November 2018

ESP-Now BATMAN/ELP progress

Over the last few weeks I've been tinkering with ESP-Now and have implemented an interpretation of BATMAN's 'echo location protocol' ELP, as set out here.

As their documents say at the top, this is an old version, but for my purposes it will suffice and I'm not looking to implement something compatible with BATMAN as deployed on other platforms.

I'd definitely class what I've done as an interpretation rather than an implementation. While the documentation suggests ESP-Now can generate broadcasts, which BATMAN specifies for ELP, I missed this on first reading (it's mentioned once in passing) and worked around this. The model for ESP-Now communication is between pre-defined peers so baking in management of these peers as part of my take on ELP is not lost effort. Yet.

In BATMAN each thing on the network is a node. If it's actively routing traffic with BATMAN it is an originator and if it is reachable in one hop it is a neighbour.

ESP-Now's concept of peers maps fairly directly to BATMAN's concept of neighbours. However you can't send data to an arbitrary node without adding it as a peer first. Peers are referred to by their primary MAC address, much like in BATMAN. You can have a maximum of 20 peers unless you use encryption when it's reduced to 10. Which is why making management of ESP-Now peers part of my interpretation of ELP, rather than just broadcasting, isn't a bad thing. You can trivially send packets to every peer and have the ESP-Now library do the management of that process for you, but it's not a broadcast and won't be sent to non-peers.
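As a rough sketch of the bookkeeping that peer limit forces (not the actual code from this project), here is a fixed-size table that refuses new entries once 20 slots are in use. `PeerTable`, `Mac` and `kMaxPeers` are names made up for illustration, and the sketch only tracks addresses; it does not talk to the radio.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Mac = std::array<uint8_t, 6>;

// 20 is the unencrypted ESP-Now peer limit mentioned above.
constexpr std::size_t kMaxPeers = 20;

// Hypothetical bookkeeping class; on real hardware add() and remove() would
// also call the SDK's esp_now_add_peer() and esp_now_del_peer().
class PeerTable {
public:
    // True if the MAC is already a peer or was added, false if the table is full.
    bool add(const Mac& mac) {
        if (indexOf(mac) >= 0) return true;
        if (count_ == kMaxPeers) return false;
        peers_[count_++] = mac;
        return true;
    }

    // True if the MAC was a peer and has been dropped.
    bool remove(const Mac& mac) {
        const int i = indexOf(mac);
        if (i < 0) return false;
        peers_[i] = peers_[--count_];  // fill the gap with the last entry
        return true;
    }

    bool contains(const Mac& mac) const { return indexOf(mac) >= 0; }
    std::size_t size() const { return count_; }

private:
    // Index of the MAC in the table, or -1 if it is not a peer.
    int indexOf(const Mac& mac) const {
        for (std::size_t i = 0; i < count_; ++i) {
            if (peers_[i] == mac) return static_cast<int>(i);
        }
        return -1;
    }

    std::array<Mac, kMaxPeers> peers_{};
    std::size_t count_ = 0;
};
```

When `add()` returns false the caller has to decide which existing peer to age out before trying again, which is where a per-peer quality score becomes useful.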

Something ESP-Now adds that isn't in BATMAN's broadcast model for ELP is that you get delivery confirmations from peers. Normally BATMAN measures the quality of transmission to first hops on the network with its routing protocol OGM as these packets are echoed back to the sender. With ESP-Now unicast I have been able to get a measure of transmission quality (TQ) from ELP alone.
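One way to turn delivery confirmations into a TQ figure is a sliding window over recent sends. This is a minimal sketch under assumptions, not necessarily how this project calculates it: the 16-send window and the class name `LinkQuality` are invented, and the 0 to 255 scale is borrowed from batman-adv. On hardware, `record()` would be fed from the ESP-Now send callback.

```cpp
#include <cstdint>

// Hypothetical window size: how many recent sends contribute to the score.
constexpr uint8_t kWindow = 16;

class LinkQuality {
public:
    // Record the outcome of one unicast send (true = the peer confirmed delivery).
    void record(bool delivered) {
        history_ = (history_ << 1) | (delivered ? 1u : 0u);
        if (samples_ < kWindow) ++samples_;
    }

    // TQ scaled 0..255 over the most recent sends; 0 if nothing has been sent yet.
    uint8_t tq() const {
        if (samples_ == 0) return 0;
        const uint32_t mask = (1u << samples_) - 1u;  // samples_ <= 16, so no overflow
        uint32_t recent = history_ & mask;
        uint32_t delivered = 0;
        while (recent != 0) {  // count the confirmed sends in the window
            delivered += recent & 1u;
            recent >>= 1;
        }
        return static_cast<uint8_t>(delivered * 255u / samples_);
    }

private:
    uint32_t history_ = 0;  // one bit per send, newest in bit 0
    uint8_t samples_ = 0;   // how many bits of history_ are valid
};
```

A peer whose score stays below some threshold for long enough would then be a candidate for removal from the peer table.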

What ESP-Now doesn't handle, and BATMAN achieves with broadcasts, is initial discovery of peers; there is an expectation this is hardcoded or done via other means of pairing. So I'm using the standard Wi-Fi SoftAP and AP scanning methods from the ESP8266 Wi-Fi libraries. The scan looks for any SSIDs that match what it's looking for and attempts to add the device as a peer if it isn't already. Once a node has some peers it switches its SoftAP off, but among a group of peers at least one leaves the SoftAP on so that group of peers can be found. I might replace this mechanism with something that involves ESP-Now broadcasts, but at the moment it works quite nicely. It's worth noting that the BSSID of an ESP8266 is not its primary MAC address. For the BSSID they use a locally administered variant of the primary MAC address, so you have to clear the locally administered bit (the second least significant bit of the first octet) to derive the primary MAC address.
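In IEEE 802 addressing the "locally administered" flag is bit 0x02 of the first octet. The sketch below assumes that flag is the only difference between the scanned BSSID and the primary MAC; the function name is made up for illustration.

```cpp
#include <array>
#include <cstdint>

using Mac = std::array<uint8_t, 6>;

// The IEEE 802 "locally administered" flag lives in the first octet.
constexpr uint8_t kLocallyAdministeredBit = 0x02;

// Assumption: the SoftAP BSSID is the primary MAC with only this flag set,
// so clearing it recovers the address to register as an ESP-Now peer.
Mac primaryMacFromBssid(Mac bssid) {
    bssid[0] &= static_cast<uint8_t>(~kLocallyAdministeredBit);
    return bssid;
}
```

For example, a scanned BSSID of 5E:CF:7F:12:34:56 would map to a primary MAC of 5C:CF:7F:12:34:56.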

As ELP includes a list of neighbours in the packet, I've a mesh where each node is an originator that discovers new neighbours, then is notified of their neighbours. Often some of these are one-hop reachable too so the mesh forms very nicely. The measure of TQ I have works very well to ensure only reachable neighbours are advertised through ELP, so each node only has believably reachable peers added and any that aren't can be aged out.

I also spent a lot of time building a basic user interface into this on the serial console, which you can see in the screenshot. When you're making your own network protocol you need to also make your own tools for monitoring and troubleshooting it. Being able to look at the peer table and logs of every node has made this much less of a head scratching task than it might have been. This UI console code is all wrapped up in conditional compiler directives so I plan on leaving it in place in the code long term. I do have some aggro though as the Windows driver for the USB serial chipset used on the WeMos D1 mini appears to be flaky. At some point it stops sending data to the ESP so you can't control the UI even though you can still see it. This doesn't happen on Linux so I'm pretty sure it's nothing to do with my code.

It's got to the point where I've had up to 12 nodes in a network and they've been up for 100+ hours managing neighbours coming and going without complaint or failure that I can see.

Emboldened by this I did a field test during a game over the weekend. I had four nodes with PIR motion detectors sending packets when triggered to a bare node connected to a laptop I was using as a prop.

This was a total failure. Not massively surprising as I hacked the code together from existing hardware over lunch. It did work when connected to the laptop but when powered by USB charge banks the nodes couldn't even discover each other, which I know works well. As I was mid-game I had no time to troubleshoot.

Assuming that failure was something trivial, I now need to move on to implementing OGM, which will allow the mesh to do proper multi-hop routing. Right now it can only deal with routing two hops to 'peers of peers' and it needs to be able to route end-to-end across the whole mesh. I think I've laid a solid foundation for this with my interpretation of ELP so I'm hoping this won't be too hard work. OGM produces a full routing table to all nodes with a composite TQ to each one made from the TQ of each hop. Once this is done it's only a small step to being able to forward arbitrary packets across the whole mesh.
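To make the "composite TQ" idea concrete: batman-adv scores each link from 0 to 255 and combines hops multiplicatively, so a path is never better than its worst link. The sketch below shows just that arithmetic and deliberately ignores the hop penalties and asymmetry handling in the real protocol. `pathTq` is a made-up name.

```cpp
#include <cstdint>
#include <vector>

// Composite TQ for a path, given the TQ (0..255) of each hop in order.
// An empty path (the node itself) scores a perfect 255.
uint8_t pathTq(const std::vector<uint8_t>& hopTqs) {
    uint32_t tq = 255;
    for (uint8_t hop : hopTqs) {
        tq = tq * hop / 255;  // scale the running score by this hop's quality
    }
    return static_cast<uint8_t>(tq);
}
```

Two hops that each score 200 combine to 156, so a route through two mediocre links can lose out to a single slightly worse direct link.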

Tuesday, 16 October 2018

Nanananana

B.A.T.M.A.N.

I think I'm going to have a go at implementing a mesh network scheme/algorithm inspired by the ones used in batman-adv IV for my project. I can't actually implement batman-adv as it's too much work and inappropriate for the ESP platform. The bare bones of the mesh algorithm on the other hand look like a good candidate and unlike a lot of things floating around online this is an active open source project that's been through multiple iterations of working on the difficulties of mesh networking.

I need to think through how ESP-Now not having a broadcast mechanism will affect it. ESP-Now can send a packet to all a node's peers, but that's not the same thing, those peers need to be found and actively added in the first place.

I've done a trivial test of ESP-Now that involves running SoftAP and periodically scanning for the beacon frames, then switching the SoftAP off once you've joined the mesh. However this would very swiftly end up with multiple isolated meshes that can't find each other.

Current thinking is to have some nodes, at least one per mesh, run as SoftAP, to maintain options for peer discovery, spinning up more periodically if none are directly visible from a section of the mesh.

Likewise ESP-Now adds another layer of complication/admin as the peer table is limited to 20. This peer list will have to be pruned/updated, not something that happens in batman-adv, which only considers if a node is one-hop reachable. So there will be three potential states for a node: one-hop reachable peers, one-hop reachable non-peers and multi-hop reachable.
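Those three states could be modelled along these lines. This is a sketch with invented names, and it assumes a node that is in the peer table but no longer one-hop reachable simply counts as multi-hop until it is pruned.

```cpp
// Hypothetical names for the three states described above.
enum class NodeState { OneHopPeer, OneHopNonPeer, MultiHop };

// Classify a node from two facts: can it be heard directly, and does it
// currently occupy a slot in the ESP-Now peer table?
NodeState classify(bool oneHopReachable, bool inPeerTable) {
    if (!oneHopReachable) return NodeState::MultiHop;  // includes stale peers awaiting pruning
    return inPeerTable ? NodeState::OneHopPeer : NodeState::OneHopNonPeer;
}
```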

Thing is I know I could fix this easily if I just started sending my own crafted vendor action frames, especially broadcasts, but this way lies an even deeper rabbit hole of reinventing the wheel than the one I'm already in. Using the standard SoftAP and Wi-Fi scanning libraries is an expedient way to avoid this I hope.

Monday, 15 October 2018

Cellar refurb

I've not been doing as much project work as usual recently as I've had a specialist company in to 'tank' my cellar.

The cellar has been used for years as my occasional workshop, but has always suffered from being a cold dank nasty place and about once a year it would flood, courtesy of backwash from a drain in the floor. The water would almost immediately run back out of the drain but it left an awful mess behind, occasionally ruining things I'd been stupid enough to leave on the floor.

I've run a dehumidifier 24x7 for maybe five years down there and it only ever managed to keep things vaguely acceptable. Until it flooded, when I'd have to move everything out, clean it all and mop up all the little pools of water left behind on the uneven floor.

I had written off the idea of getting it professionally tanked as too expensive, but when I recently got a quote it was much more affordable than I expected.

The whole project has taken about two months and I've spent lots of my free time in first clearing the cellar out, decorating it and then moving everything back down.

I have purposefully gone with the 'industrial' look of exposed joists to give me a little more headroom and painted everything stark white for light. There is enough height to walk around, unless you're well over 2m tall, but a proper ceiling would have made it quite claustrophobic.

Now the paint and plaster skim has dried out, it sits nicely at the ambient temperature and humidity of the rest of the house, you don't really feel you're in a cellar.

I'm very pleased with my new den.

Monday, 1 October 2018

Am I barking up the wrong tree?

After a successful test of painlessMesh at the weekend I posted about it on the ESP8266 Facebook group and somebody wondered if I'd considered ESP-NOW.

Which I may have seen mentioned a couple of years ago but have never fiddled with and completely forgotten.

While painlessMesh is doing what I want I have in the back of my mind that it's layering heavyweight stuff together (AP associations, WPA2 encryption, TCP servers, NTP-like time sync) to give you a simple messaging protocol.

Despite my assertion I didn't want to get sucked into writing my own networking code I had actually been Googling to see if you could send/receive packets directly between ESP8266 stations without actually associating with an AP at all. Most of the chatter about that is building antisocial Wi-Fi de-auther devices, and it didn't look a very fruitful route.

Of course this is exactly what ESP-NOW is. It's a proprietary messaging protocol that's part of the Espressif suite of dev tools for the ESP8266/32. It leverages standards compliant 'vendor action frames' which allow vendors to send frames directly between stations for their own proprietary reasons.

As this is very low level, ESP-NOW supports star networks with 20 nodes, rather than the 5 nodes of the full WPA2 encrypted IP connection that painlessMesh is built on. So a very similar 'star of stars' mesh built out of ESP-NOW connections should scale more gracefully. It also leaves some resource for a conventional IP device to bridge in to the mesh and talk to it. It is unencrypted by default but you can switch encryption on, which lowers the capacity to 10 nodes.

As painlessMesh seems to handle scaling quite well, I'm more interested in the greater power efficiency and range promised by ESP-NOW.

If you've ever noticed you can see a Wi-Fi SSID way outside the range you can make useful connections to it, that's the kind of range ESP-NOW promises. It's not doing all the encryption, background keepalive and management work needed for a full AP association, so it's much more tolerant and fewer packets means less power.

Digging deeper there is also ESP-MESH in the dev tools which the documentation promises is a self organising mesh built out of ESP-NOW connections.

Why have I not noticed these protocols before?

The answer seems to be nobody in the 'maker' community bothers with either ESP-NOW or ESP-MESH. I've found a handful of people talking about it but very much proof of concept builds of the example code, no mention of big mesh projects. I guess the IoT spin that's put on the ESP product lines has meant the non-IP protocol suite is mostly ignored.

As this is all changeable in the code I'm going to continue working with painlessMesh, build the rest of my static nodes and start on the wearable GPS-equipped nodes. This will allow me to conduct a bigger test including them at the end of October. If I subsequently find a way to drop in ESP-MESH or engineer my own thing with ESP-NOW it'll be a 'range upgrade'.

Sunday, 30 September 2018

painlessMesh testing

Yesterday I had a go at testing six painlessMesh nodes in a similar situation to the one I need them for. As per usual this is for a LARP prop and we almost exclusively play outdoors in heavily wooded areas. As I discovered with the last networked thing I made, trees are very good at blocking Wi-Fi so you really need to do testing out in the woods.

For this test I invested a chunk of effort into building these nodes fairly robustly into the IP55 box you can see in my first post about them.

Testing based off almost finalised builds has a risk of being a dead end where you've wasted that time in making them almost complete. However if you're going to put them up in the air on poles spread across a wood and you can't guarantee it won't rain, they need to be robust and vaguely weatherproof.

Putting them on poles was an attempt to make them inconspicuous to passers by but more importantly to stop the signal being blocked by low level obstacles. I had several sectional aluminium poles that allowed me to get the nodes out of eyeline without being so high they were in the canopy of the taller trees. You really don't notice the ones on poles wandering about and I lost one a couple of times before I made the sensible decision to take waypoints with my phone.

After a little experimentation with node placement I got a corridor along the main play area of our game covered. Once the full mesh was up I was able to send packets across painlessMesh from one end to another. So six nodes isn't enough for what I want to do but now I've seen how it scales I know it can be done without getting overly expensive.

I have ordered another six Wemos D1 mini Pro which should allow me to cover as much of this site as I need. My plan for these nodes is to cover a 'corridor' up the middle of the site linking the key locations, but as this is a mesh, once you add in the nodes carried by players and crew it should only get better.

Node placement was done with the useful behaviour of one of the painlessMesh example programs, where it flashes an LED counting out the size of the mesh, like in the video below. I chopped this example code around a little to include an activity LED and remove the 'hello world' stuff but as these nodes exist just to be part of the mesh the code is essentially done now.


The easily visible node count means you can extend coverage by walking away from the last node and when the count drops back to one, just retrace your steps until it rejoins. This gave me a mostly stable mesh really easily without tons of planning or having to consult a screen.

I'm going back to the site again in October and I plan to have at least ten nodes ready by then, mostly limited by having run out of the nice IP55 boxes I got cheaply from Maplin when they were closing down.

Friday, 21 September 2018

Building a painlessMesh ecosystem

I've spent the last week or so messing about with painlessMesh and it's been a mostly straightforward exercise.

As this is all based around microcontrollers, building any kind of monolithic device with multiple roles requires careful thought so I've stuck with an ecosystem of simple single purpose things.

The idea is also that if these are spread around in use then it extends the mesh, instead of having some central server everything speaks to, too far away to be reached.

The main one is pictured here, it's a combination Lasertag sensor & GPS tracker. OK that's two functions but the whole point of what I'm doing is for this thing to be usable. I need to know where people are and what state they're in.

I've also chucked together the following...

  • Two relay nodes, as seen before, vaguely weatherproof with external Wi-Fi aerials and the option to connect big Yagis.
  • Three compact painlessMesh to USB serial dongles, which allow you to interact with it by sending commands and viewing messages from a computer. These may eventually end up connected to a Raspberry Pi.
  • A logger that writes everything it sees to CSV files on an SD card, one file per node, with timestamps.

All this has been chattering away quite happily but I've got a few minor issues...
  • The Sensor & GPS node gets a bit overwhelmed and rebooted by the ESP8266 watchdog timer, especially if it's printing a lot of debug information over serial. It can manage 8 or more hours active so ditching the debugging info will I hope render it 100% reliable.
  • The USB dongles seem to die when the computer goes to sleep for a long time, which I'm assuming is due to it putting the USB to sleep in some way that they won't recover from.
  • Sometimes things just don't join the mesh until you power-cycle them.

None of this seems at all insurmountable, compared to writing my own mesh network library (again) so I'm sticking with painlessMesh for this project.

Going forward I really need to do some range testing of this and I've got a week to get ready for that.

Depending on how much range I get then I may need to do 'caching' of status messages to presently unreachable nodes. So as things move around they get delivered when they come into range. This is not a feature of painlessMesh so I'd be layering it on top, probably using the SPIFFS filesystem on the ESP8266 as storage if there's not enough free memory. Given these nodes will be attached to people moving around it'd basically be automated sneakernet.
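A minimal sketch of that kind of cache, held in RAM only: messages for unreachable nodes sit in a bounded queue, the oldest is dropped when it fills, and everything for a node is released when it reappears. The class and its 32-bit destination ID are hypothetical, it is not part of painlessMesh, and spilling to SPIFFS is left out entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// One held message: who it is for and what it says.
struct Pending {
    uint32_t dest;
    std::string payload;
};

class DeliveryCache {
public:
    explicit DeliveryCache(std::size_t capacity) : capacity_(capacity) {}

    // Hold a message for a node that can't be reached right now.
    void hold(uint32_t dest, const std::string& payload) {
        if (capacity_ == 0) return;
        if (queue_.size() == capacity_) queue_.pop_front();  // drop the oldest
        queue_.push_back(Pending{dest, payload});
    }

    // Remove and return everything waiting for dest, oldest first.
    std::vector<std::string> release(uint32_t dest) {
        std::vector<std::string> ready;
        std::deque<Pending> kept;
        for (Pending& p : queue_) {
            if (p.dest == dest) ready.push_back(std::move(p.payload));
            else kept.push_back(std::move(p));
        }
        queue_.swap(kept);
        return ready;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::deque<Pending> queue_;
};
```

Calling `release()` whenever a node shows up in the mesh again would hand back its backlog in the order it was queued.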

I'm still using human readable, verbose messaging but that may have to go. Or perhaps send semi-readable codes with numeric values encoded as hex. Doing a load of parsing of things like latitude and longitude to/from readable strings, something that's kind of trivial on real computers, starts to chip away at available resource when playing with microcontrollers, especially if you want to buffer these up and store them for later delivery. When I wrote my mesh network for Ciseco xRF radios I agonised over whether to use 12 or 16 byte packets and settled on 12.
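As an illustration of the "numeric values encoded as hex" option, a coordinate can be stored as whole microdegrees in a signed 32-bit integer and sent as eight hex characters. This is only a sketch of the idea: the microdegree resolution and both function names are assumptions, not the format this project uses.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Encode a latitude or longitude as 8 hex characters of signed microdegrees.
std::string encodeDegrees(double degrees) {
    const int32_t micro = static_cast<int32_t>(std::lround(degrees * 1e6));
    char buf[9];
    std::snprintf(buf, sizeof buf, "%08lX",
                  static_cast<unsigned long>(static_cast<uint32_t>(micro)));
    return std::string(buf);
}

// Decode the 8 hex characters back to degrees.
double decodeDegrees(const std::string& hex) {
    const unsigned long raw = std::strtoul(hex.c_str(), nullptr, 16);
    // Reinterpret the low 32 bits as a two's complement signed value.
    const int32_t micro = static_cast<int32_t>(static_cast<uint32_t>(raw));
    return micro / 1e6;
}
```

Eight characters per coordinate is fixed width, so it needs no delimiter parsing, and the receiver only does the floating point conversion if it actually needs degrees.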

Sunday, 16 September 2018

You had one job

For my ESP8266 mesh project I plan on using ESP-01S modules if I can as I've got at least 16 of them kicking around, maybe more.

Should I need more they're the cheapest way to buy an ESP8266 module, especially with through-hole connections.

They're also very compact. The flipside of this is they only have a couple of conventionally usable GPIO pins and are a pain to work with in other ways. You can't use them in a protoboard because of the double row connector layout. You also need to pull several pins high for them to boot up, change that for programming and so on.

Which is why I've got a load of ESP-01S modules kicking around, I now use WeMos D1 mini boards most of the time I want to use an ESP8266 in something.

With a view to using this stash of components up I bought these little special purpose USB-Serial "ESP-01S link" adaptors.

Which didn't work.

I was surprised as the stuff I get from Banggood is almost without exception good, I don't think I've ever had anything completely duff from them apart from this.

A bit of poking around showed they don't pull the CH_PD pin high, which is needed for the ESP-01S to boot. So I've soldered a little link on the bottom (see picture) and now they're perfect. With a button and a couple more links they could be configured for programming the ESP-01S but I'm not fussed about that. I want these as diagnostic tools.

I've written some minimal code that lets you use one as a 'bridge' to painlessMesh over a USB serial port. It prints any incoming packets from the mesh in a simple text format and you can send unicast or broadcast packets. An ESP-01S in one of these adaptors makes for a nice tidy little 'dongle' to work with.

I might cook up a simple UI to get a list of nodes, topology and so on, but this is all I need for now.

Friday, 14 September 2018

Is painlessMesh really painless?

I should blog more, I have been doing stuff, honest.

Anyway I have plans for more mesh networked, location aware stuff. While I am going to try and sort out the Tilda EMF badges I have, the proprietary Ciseco radio tech in them is obsolete. It's great but I can't buy any more so need a new, cheap, way to put things together. Then maybe bridge the two technologies together.

Going back to the time when I was using the Ciseco radios, Wi-Fi on microcontrollers was a pain and expensive. Then the ESP8266 arrived and changed that. It's now cheap as chips and a very mature platform for tinkering with. Wi-Fi has loads of downsides for the sort of things I want to do but it is ubiquitous. So I'm going to do some evaluation of it as a basis for my next project of this sort.

I've settled on using the painlessMesh Arduino library to do the lifting for me and have just built a test node comprising a WeMos D1 mini Pro with external antenna in a waterproof box along with battery, charging connector and some indicator lights.

You can configure painlessMesh nodes as being 'static' so they are more likely to be a hub in the mesh and my plan is to have a few boxes like this to help keep the wearable nodes in touch.

The painlessMesh API also includes some nice functions for finding out what nodes there are and how they're connected so I aim to build a tool that prints that out allowing me to check how well connected the mesh is and also 'ping' things to check them.

If ESP8266/32 and painlessMesh don't stack up I'll try LoRa, which should be better suited in some ways. However LoRaWAN means playing nicely with the standard for gateways and this is very limited in bandwidth and talk time etc. Using just dumb broadcast LoRa radios would be antisocial to any nearby LoRaWAN and mean writing my own mesh network code. Again.

Monday, 25 June 2018

Resurrecting the Tilda MKe - part 1

A long time back I bought ten surplus EMF camp badges as they were exactly what I wanted as a platform for a networked handheld gadget I could use in LARP.

Now I'm going through a bit of a phase where I'm reviewing all the stuff I have accumulated and deciding whether to keep it or not for future projects.

Looking at them now they are still pretty much everything you need for making some kind of small device with a basic UI, except Wi-Fi as they used Ciseco data radios instead. However I've a pot full of ESP-01 modules and also a small stash of Ciseco kit despite them going out of business. Tacking an ESP-01 on to one of the serial ports will give it the best of both worlds.

Wi-Fi might be ubiquitous indoors but for networking low resource microcontroller based things in remote outdoor environments it's actually OTT and the range sucks. Things like the Ciseco radios, or nowadays one of the LoRa standards, make much more sense but being able to connect to Wi-Fi if available is handy.

Which makes them keepers.

So I've made an attempt to do something with one. This particular one I 'killed' by trying to upload a basic 'blink' sketch to it shortly after I bought them and the whole lot have stayed stashed in a box in my cellar since.

The development libraries released at the time are now all rather 'legacy' and tied into a complicated framework that uses FreeRTOS to deliver multi-tasking. So while the Tilda is in principle Arduino Due compatible this makes it more of a job than just adding some libraries and getting on with it.

There's an onboard jumper for wiping the flash and after I used this, uploading a simple sketch to it like it was an Arduino Due got it responsive again. I'm not quite sure how I 'killed' it originally as this wasn't at all obscure or hard.

Unable to import all the board information into modern versions of the Arduino IDE I've been fishing in the header files and managed to make all the buttons & LEDs work, plus the Ciseco radio.

Not much luck with the screen yet, apart from turning the backlight on, as despite wading through the docs I haven't been able to work out which pins the screen driver is connected to. Worst case I'll attempt to disentangle the legacy library and shoehorn it into a modern environment.

There's also an MPU-6050 gyro/accelerometer but I'm not desperately fussed about that working.

If I can make a simple example sketch that drives the screen and most of the stuff I'll post it up.

Wednesday, 23 May 2018

I wanna introduce you to a personal friend of mine


Some time back, a friend who was giving up Lasertag LARP was getting rid of a bunch of their old kit. Amongst it was this 'Pulse Ranger' shell, made by some enthusiasts years ago who started up a small business to give people what quite a few wanted at the time: Pulse Rifles.

Because Aliens.

I've wanted a Pulse Rifle ever since I started sci-fi LARP so when he thrust this into my hands I was overjoyed. I had considered taking one of the 3D models there are kicking around and trying to turn it into something 3D printable on a normal hobby printer and with spaces inside to use, but that's a mountain of work.

These shells are very thick blow moulded ABS and tough as hell so it's weathered the years brilliantly. They're not perfect replicas but ape the silhouette very nicely and with a bit of work it should fill the gap I have in my set of Lasertag weapons.


Inside the shell were the remains of an old Lasertag circuit. It was an old board from an original Starlyte and looked a bit sad so I decided to scratch build a new set of electronics inside. This also gave me the chance to make a modern DoT weapon that does multiple hits and the grenade launcher is a nasty thing.

Also back in 1986, having an LED ammo counter on the side was pretty much the definition of cool. So it had to have something like that too. Nowadays we have so much cheap and useful tech around that sticking a little OLED display in is trivial.

With an OLED in there it opens up the option to display more stuff than just the ammo counter and I've been vaguely toying with the idea of having a UI in a 'tag weapon to select different sounds and so on. This will now be my test bed for that.

The shell doesn't have any meaningful mounts inside for the lens unit, so I 3D printed some spacers that hold it in the 'grip' section snugly when you screw the shell together. I epoxied it to one half of the shell, did it all up and left it overnight, making it rock solid. With no datum lines and the shells being somewhat flexible it's impossible to guarantee it's straight so I'll have to adjust the sight rails to suit once everything's together.

Then it just became a case of bashing all the wiring together, I already had one of my scratch built gun boards. Working with it reminded me that I really really should standardise it and get some proper PCBs printed, given how cheap it is, but one gentle weekend of faffing about and I have a Pulse Rifle.

Feel the weight.