Better partition Bullwinkle.Package and bullwinkle messages sent "over the wire" #20

deldrid1 · 2016-10-04T13:24:00Z

This set of commits provides better partitioning between the role of the Bullwinkle.Package (stored in the sending party's RAM) and the message that is actually sent between agent/device.

The main driver behind this is to allow custom timestamps to be sent with each message. Before, Bullwinkle relied on the message.ts for its internal timeout logic. Now, message.ts can be set to whenever the message was originally generated (say before it was written to spiflash) and Bullwinkle.Package._ts is used for the watchdog, etc.

I've not done a substantial amount of testing (are there any impTests for this lib?) but the update appears to generally be working for me... The README has also yet to be updated with the optional ts value in .send()

This allows for creating globally Unique IDs (assuming the imp hits a reboot every 2,147,483,647 messages, which is over 68 years assuming 1 message/second). By setting prependID to the boot number of the device (which can be stored in a SPIFlashLogger), you can have this property.

Eliminates the number of tries from being sent over the connection to reduce data consumption of the class. Also includes a bugfixes for storing the timestamp in the Bullwinkle.Package instead of the message object

…e server connection is unavailable

This reverts commit d07ea66.

ppetrosh · 2016-10-05T04:08:05Z

Bullwinkle.class.nut

@@ -147,7 +151,6 @@ class Bullwinkle {
            "name": command,
            "data": data,
            "ts": ts,
-            "tries": 0,


This is a good idea to get rid of "tries" here.

There is a comment in the other thread that message.retry and message.tries are items that I don't expect anyone would be using,. These are things I use frequently (in almost every project) to allow for [3] retries before failing with an error message. Be careful.

I should have better stated that I don't expect anyone would be using those fields on the _partner side (although maybe for analytics on the number of failures?).

My opinion is that this data is wasteful and doesn't need to be transmitted over the wire but there may be use cases I haven't thought of as I didn't write the lib originally...

I also feel that way about message.retry, however, I haven't gotten around to making that change based on the push back in the PR so far.

ppetrosh · 2016-10-05T04:15:03Z

Bullwinkle.class.nut

-        if (message.type == BULLWINKLE_MESSAGE_TYPE.SEND) {
-        	message.tries++;
+        if (message.type == BULLWINKLE_MESSAGE_TYPE.SEND && message.id in _packages) {
+		        _packages[message.id]._tries++;


Minor cosmetic: let's please try to keep the size of indents the same across the code, like 4 spaces.

Sorry - copy/paste from the IDE to Atom does some weird things... Getting ElectricImp-Sublime working is a must! :)

Got it. Thanks! I'm working on it...

This change from tries to _tries breaks my existing code.

Is it worth making the breaking change to reduce overhead on the wire? (I would think from imp's perspective for "free" developer devices and your paying customer's perspective for large device deployments, that would be a win...)

@blindman2k - could the onFail callback function be updated to something like

.onFail(function(err, message, retry, numTries) { // Try sending the message again in 60 seconds, assuming we've not made more than 3 tries if(numTries <3){ if (!retry(60)) { server.error("No more retry attempts are allowed"); } } })

I'm also curious as to why you look at message.tries directly instead of just passing the maxRetries setting and let the class take care of it for you? I've not used the retry functionality too much myself (TCP typically takes care of it for me, or I just assume the message will fail and I need to deal with it some other way) so I may be missing something...

a/ The maxRetries parameter came after the tries field
b/ If I remember correctly the maxRetries doesn't inform the caller when it stops retrying. Not 100% on this.

Breaking developers code is not something I would recommend at this stage.

ppetrosh · 2016-10-05T04:28:14Z

Bullwinkle.class.nut

@@ -385,7 +395,7 @@ class Bullwinkle {

            // Add the retry information
            message.retry <- {
-                "ts": time() + timeout,
+                "ts": ( Bullwinkle._isAgent() ? time() : hardware.micros()/1000000 )  + timeout,


In general this looks a little bit confusing to me. We have Package.ts and Message.ts. It's almost impossible to remember which of them is used for what. If one needs to have a custom timestamp for the data, from my POV it should go to the data payload and be handled on the app level as well as all the custom information.

@betzrhodes, @blindman2k what do you think?

I think I disagree? We have a Package._ts (used for Bullwinkle internal logic) and we have a message.ts that corresponds to the timestamp that is associated with the message.data.

I don't disagree that the ts could go in the data payload, however, message.ts is a key that much of my application code relies on (and I would assume others rely on it as well) so I don't think that removing its use in the lib should be taken lightly.

My point is that we should probably try to keep all the "custom" and application specific stuff in the message payload.

And regarding this particular change I'd recommend to keep the _ts for internal/generic purposes and build all the application specific logics on the custom message payload fields.

I'm not opposed to moving the retry data structure over to the Package - just work that I haven't done yet and wasn't necessary to accomplish what I am doing with impPager.

message.retry and message.tries are items that I don't expect anyone would be using, but message.ts is something that I would think lots of folks are using.

Either way it is probably a major bump to the lib since there could be user breaking features (if people are using those message key/values) so if you want to remove the ts from the message and make it be a part of the payload that is fine with me. I'll just wrap .send() and ensure that a ts is always added to the data :)

No-no, I don't want to remove .ts from the message. Neither I want to change it's semantics (what essentially the pull request does). My proposal is that if someone needs a custom timestamp, then he/she should be using the message payload to add it. Does it make sense?

I am confused by this change ... time() is seconds from epoch (or from boot if the device hasn't been online since power-up) and hardware.micros() microseconds from boot but wraps around every 36 minutes. How is this change not going to make a massive mess?

It seems like to me message.ts should be the eventTime in epoch seconds. This provides the timestamp for the data and should be what is transmitted to the _partner for it to do whatever useful thing might be done with a timestamp that corresponds to the data.

For this pull request, package._ts is used only for timeouts and calculating latencies. time() by itself doesn't give very useful info as it is only precise down to the second, hence the need to go to hardware.millis() or hardware.micros(). time() also doesn't tick when there is no RTC present. Since 36 minutes seems like significant overkill for a timeout or for a network latency, I went with hardware.micros().

I agree that time() is a pain, especially for code that executes before going online. You are probably right that 36 minutes is enough time for most but not for all. At the very least we would need to warn (or prevent) the developer not to set a messageTimeout greater than 2160.

Also, watchdog is no longer valid as it uses time() instead of micros().

We could use hardware.millis() to would provide a much larger window (25 days), however, I chose hardware.micros() to provide some amount of consistency between the agent and device...

In my very ugly code, watchdog is still valid. If time() is invalid and not ticking, the message fails before it ever gets to the watchdog. If time() is ticking, the watchdog still works correctly by doing all of the nasty split() calls.

I would prefer to use millis() and you could always pad it with 0's. But it's not something you can compare to the agent so it hardly makes a difference.

Argh, this is a mess.

When used with some kind of nv storage, this allows for preventing ID collisions on the Agent or with something like impPager.

ppetrosh · 2016-10-05T17:02:08Z

Bullwinkle.class.nut

+            "maxRetries":         ("maxRetries" in settings) ? settings["maxRetries"].tostring().tointeger() : 0,
+            "autoRetry" :	        ("autoRetry" in settings) ? settings["autoRetry"] : false,
+            "lowMemoryThreshold": ("lowMemoryThreshold" in settings) ? settings["lowMemoryThreshold"].tointeger() : 15000,
+            "firstMessageID":     ("firstMessageID" in settings) ? settings["firstMessageID"].tointeger() : 0


Can you please elaborate, what is this for? Thanks!

firstMessageID is designed to allow semi globally unique message IDs and limit the chance of collisions (in case you are keeping up with the IDs on the Agent, etc. to know if you have missed a message).

Bullwinkle has a pool of 2,147,483,647 IDs that it can use. Based on the RAM/SPIFlash limitations of Agent/Device, you are never going to have that many valid IDs available to compare against (although over a course of weeks/months you may cycle through them). This just allows you to have bullwinkle's ID generation start wherever you want it, instead of at 0. (For impPager testing, seeing message ID 0 after multiple reboots in a row wasn't terribly helpful for debugging)

See line 67 - this ought to be refactored to just be used directly there (there is no reason to keep it in this settings object as it is only used in the constructor), however, it does make the code a bit clearer for someone who is reviewing the source code instead of the readme.

Ok, got it. Thanks for the explanation. Let's please describe all these new configuration settings in the doc, when we all agree on the changes.

ppetrosh · 2016-10-05T17:07:43Z

Bullwinkle.class.nut

+		        _packages[message.id]._tries++;
+        }
+
+        if(imp.getmemoryfree() > _settings.lowMemoryThreshold && (_isAgent() || server.isconnected())){


I believe we discussed this awhile ago and agreed to keep track of available memory on the app side. But I might be wrong. Not sure if this is generic enough to be a part of the library but don't want to push back hard here.

It was left rather open ended in Issue #18. This is essentially a 2 line implementation that adds very little library bloat, so I went ahead and snuck it in.

If we end up adding this feature into the library, my preference would be to at least keep the default behavior unchanged, i.e. have this flag equals to 0.

Forgive my bias towards wanting the feature :)

That seems reasonable to me to preserve existing functionality.

Sneaky, very sneaky.
I suggest you set the default value at 0 instead of 15000 to ensure that functionality is the same.

ppetrosh · 2016-10-05T17:09:24Z

Bullwinkle.class.nut

-                    delete __bull._packages[message.id];
-                }
+            local ts = "retry" in message ? message.retry.ts : split(package._ts, ".")[0].tointeger(); //Use either the retry ts or the package ts time(), but NOT the message ts so that it can be set for whenever the data was generated, instead of when Bullwinkle attempted to send it
+            if (t >= (ts + _settings.messageTimeout) || t == 946684800) { //RTC is invalid, which implies we have no connection and should retry immediately.


946684800 should probably be a constant.

const RTC_INVALID_TIME = 946684800; //Saturday 1st January 2000 12:00:00 AM UTC - this is what time() returns if the RTC signal from the imp cloud has not been received this boot.

seem reasonable?

Cool, thanks!

Namespace ...
const BULLWINKLE_ RTC_INVALID_TIME = 946684800; //Saturday 1st January 2000 12:00:00 AM UTC - this is what time() returns if the RTC signal from the imp cloud has not been received this boot.

blindman2k

The use of hardware.micros() is invalid. The changes to tries is a breaking change that is unnecessary.

blindman2k · 2016-10-05T22:48:59Z

Bullwinkle.class.nut

@@ -21,7 +21,7 @@ const BULLWINKLE_WATCHDOG_TIMER = 0.5;


 class Bullwinkle {
-    static version = [2,3,1];
+    static version = [2,3,2];


This should be:
static version = [2,4,0];

Based on where the pull request ends up, it might ought to be [3,0,0]...

I think we may reserve v3.0.0 for a complete rewrite :)

I'm just (more than gently) trying to nudge you in that direction ;)

I am not the one you need to convince 😸

blindman2k · 2016-10-05T22:57:10Z

Bullwinkle.class.nut

@@ -95,10 +99,11 @@ class Bullwinkle {
    // Parameters:
    //      name            The message name
    //      data            Optional data
+    //      ts              Optional timestamp for the data


It is unclear what the developer is expected to do with this new field nor what effect setting it will have

I suppose having proper docs would help with that :)

Its just a timestamp associated with the data. If not passed in, Bullwinkle will use the time that .send() is called (which is the v2.3.1 behavior) but this change allows for any timestamp that a developer wants to use.

Imagine a battery powered device that records data 1/minute and writes it to nv/spiflash. Then 1/hour it connects to WiFi and send all of the data. Probably in this case you would just send one big data structure of value/ts but you could send 60 bullwinkle messages with the correct timestamp as part of the message.

The actual reason I went through the trouble of making these changes is for impPager. With that class, we write data to spiflash when bullwinkle.onFail happens. I then read 1 datapoint at a time and send it through bullwinkle (to avoid crashing out of memory if I send all the data at once). Here, I would like the ability to send the data with the original timestamp the data was created, not the timestamp that I call bullwinkle.send() for the nth time.

Why not put the timestamp in your data instead of in Bullwinkle's?

Fully agree with Aron. This is what I'd suggest too.

That is fine (although not what I have implemented currently). If this is the path forward, I still don't think .ts needs to be sent across the wire or stored twice in RAM... Although that may be a v3 feature

You might not but others might 👯‍♂️

blindman2k · 2016-10-05T22:59:14Z

Bullwinkle.class.nut

+        if(imp.getmemoryfree() > _settings.lowMemoryThreshold && (_isAgent() || server.isconnected())){
+          _partner.send(BULLWINKLE, message); // Send the message
+        } else if(message.id in _packages){ //run the failure flow (if the package exists)
+          local reason = imp.getmemoryfree() <= _settings.lowMemoryThreshold ? BULLWINKLE_ERR_LOW_MEMORY : BULLWINKLE_ERR_NO_CONNECTION


This technique is a bit lazy. It's not a big issue now but it may become one in the future.

blindman2k · 2016-10-05T23:14:53Z

Bullwinkle.class.nut

@@ -561,9 +595,7 @@ class Bullwinkle.Package {
            local d = date();
            return format("%d.%06d", d.time, d.usec);
        } else {
-            local d = math.abs(hardware.micros());
-            return format("%d.%06d", d/1000000, d%1000000);
+            return format("%d.%06d", time(), hardware.micros());  //this can be a bit of an ugly _ts but it allows us to calculate latencies up to 36 minutes long...


Confirmed, it is seriously ugly. You have littered the code with split() calls by doing this. I suggest having _ts as an array so you don't have to keep splitting it or make a function to split it like you have a function to create it.

The problem is it is NOT a valid use of hardware.micros(). On the agent it is valid, on the device it is not. hardware.micros() is not a sub-value of time().

True - it is not a subvalue of time and array makes more since. However, I was trying to keep consistency between the agent and device code implementations as much as possible (package._ts is always a string as opposed to a string on the agent and an array on the device).

Timeouts work off of the integer time() in seconds, and getLatency knows what is going on, but the implementation is confusing and ugly.

Since ._ts is a "private" _property, people shouldn't be depending on it and we can likely move to an array?

Oh I see, getLatency only uses the seconds on the device. I missed that. Now, it's more accurate but even more ugly!

deldrid1 added 7 commits September 27, 2016 13:05

Bump Version

fb1c863

Allow custom timestamps to be used with .send(name, data, ts)

e58d826

Move message.tries into package._tries

348fd2c

Eliminates the number of tries from being sent over the connection to reduce data consumption of the class. Also includes a bugfixes for storing the timestamp in the Bullwinkle.Package instead of the message object

Move _packageTimeout into its own workflow and call immediately if th…

dabe747

…e server connection is unavailable

Revert "Add prependID to settings"

4784503

This reverts commit d07ea66.

Bug fixes identified during testing

e2149af

deldrid1 mentioned this pull request Oct 4, 2016

Update to rebuild timestamps when no RTC is available at boot deldrid1/ImpPager#1

Open

deldrid1 added 2 commits October 4, 2016 15:21

Bug fix for usage on Agent

e5d00f7

Add lowMemoryThreshold setting to address electricimp#18.

adb215a

ppetrosh reviewed Oct 5, 2016

View reviewed changes

deldrid1 added 3 commits October 5, 2016 11:45

Bug fix for messages not calling their onFail handlers after a timeout

82ce3b4

Add _checkTimer for all imp.wakeup's

b9b0155

Add firstMessageID setting to Bullwinkle.

5e91b88

When used with some kind of nv storage, this allows for preventing ID collisions on the Agent or with something like impPager.

ppetrosh assigned ppetrosh, blindman2k, betzrhodes and terrencebarr Oct 5, 2016

ppetrosh reviewed Oct 5, 2016

View reviewed changes

blindman2k suggested changes Oct 5, 2016

View reviewed changes

bugfix for ts

b1514ce

Better partition Bullwinkle.Package and bullwinkle messages sent "over the wire" #20

Are you sure you want to change the base?

Better partition Bullwinkle.Package and bullwinkle messages sent "over the wire" #20

Conversation

deldrid1 commented Oct 4, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deldrid1 Oct 6, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ppetrosh Oct 5, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

blindman2k left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deldrid1 commented Oct 4, 2016 •

edited

Loading

deldrid1 Oct 6, 2016 •

edited

Loading

ppetrosh Oct 5, 2016 •

edited

Loading