Web.post chunks get forwarded immediately in contradiction to documentation

The documentation for Web Transactions states that "web.post supports sending payload fragments and can assemble these fragments before routing to your cloud application." This implies that each fragment is gathered and collected into one prior to routing to the cloud application.

However, in practice, it seems that each fragment is forwarded to the Web application immediately. For example, I’m attempting to send 8KB chunks.

{'req': 'web.post', 'route': 'MyRoute', 'name': 'somepath/', 'total': 10068, 'payload': '8192bytes', 'offset': 0, 'status': 'payloadmd5'}

This causes the fragmented payload to be sent to the upstream web application immediately, without waiting for the second fragment.

Is this expected? If not, what appears to be missing in my web.post request that would cause it to have this issue?

Doh! After posting this I did a bit more digging and realized that the fragmented payload feature is part of firmware 3.2.1 and my device was running 1.5.6!

After updating to 3.2.1 (using Notehub) everything works as expected. Hopefully my post will help others in the same situation.


@dirtdevil - thanks for pointing this out! We will update the docs accordingly as well.

Rob

Hi dirtdevil,
How are you performing the fragmentation - using python split or something else?

@RobLauer - in working on this, I found a few other areas where the documentation can be improved. Specifically:

  • The documentation is ambiguous about whether “total” and “offset” are byte positions in the underlying raw payload or in the base64-encoded payload. Through trial and error I determined it was the former.
  • The documentation says the “status” md5sum should be base64 encoded but in practice I had to use hex encoding.
  • I was getting a response with a “payload” member that contained a base64 encoded representation of the upstream payload. This member is not documented at all.
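
The first two points can be illustrated with a short sketch. It assumes, based only on the trial-and-error findings above, that “total” and “offset” count bytes of the raw payload and that “status” carries a hex MD5 digest of the raw fragment. describe_fragment is a hypothetical helper name, not part of any Blues library.

```python
import base64
import hashlib


def describe_fragment(payload: bytes, offset: int, chunk_size: int = 8192) -> dict:
    """Build the fragment-related fields of a web.post request.

    Assumption (from trial and error, not from the docs): "total" and
    "offset" refer to the raw payload, and "status" is a hex MD5 digest
    of the raw fragment.
    """
    fragment = payload[offset:offset + chunk_size]
    return {
        "total": len(payload),   # raw byte count, not the base64 length
        "offset": offset,        # raw byte position of this fragment
        "payload": base64.b64encode(fragment).decode("ascii"),
        "status": hashlib.md5(fragment).hexdigest(),  # 32 hex characters
    }


# Second fragment of a 10068-byte payload, matching the sizes in the first post.
fields = describe_fragment(bytes(10068), offset=8192)
print(fields["total"], fields["offset"], len(fields["status"]))
```

Note that the base64 string in “payload” is longer than the raw fragment, which is why it matters which of the two the offsets refer to.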

I also ran across a few challenges while debugging that might be useful features to add to Notehub. I couldn’t find anywhere within Notehub where I could see the request received by the Web Proxy, nor the response received from the upstream web service.

In my case, there were error conditions where the upstream replied with content that exceeded 8192 bytes, so all I got on the Notecard was a message saying the response was too large. It would have been very helpful to log in to Notehub and see a log of the request and response between the Web Proxy and the web service (similar to the very helpful Events log). Perhaps that exists and I just missed it.

@scjerry - in Python. Here is a helper function I made and have tested to work correctly with the latest firmware.

import base64
import hashlib
import logging


def web_post(card, route, payload, name=None, chunk_size=8192):
    """Send payload (bytes) via web.post, fragmenting it when it exceeds chunk_size."""
    offset = 0
    rsp = {}  # stays empty if payload is empty and nothing is sent
    fragmented = len(payload) > chunk_size
    while offset < len(payload):
        req = {"req": "web.post"}
        req["route"] = route
        if name:
            req["name"] = name
        if fragmented:
            fragment = payload[offset:offset + chunk_size]
            req["total"] = len(payload)  # size of the raw (unencoded) payload
            req["payload"] = base64.b64encode(fragment).decode("ascii")
            req["status"] = hashlib.md5(fragment).hexdigest()  # hex MD5 of the raw fragment
            req["offset"] = offset  # byte position in the raw payload
            req["verify"] = True  # not sure if this is required or not
            logging.debug("sending web.post fragment of length %s at offset %s", len(fragment), offset)
        else:
            logging.debug("sending web.post of length %s", len(payload))
            req["payload"] = base64.b64encode(payload).decode("ascii")

        offset += chunk_size
        rsp = card.Transaction(req)
        logging.debug("web.post response %s", rsp)
        # if data remains to be transmitted we expect a result of 100
        if offset < len(payload) and rsp.get("result") != 100:
            raise RuntimeError("error in fragmented web.post")

    # Get the last response payload, if any
    response_payload = None
    if rsp.get("payload"):
        response_payload = base64.b64decode(rsp["payload"])
    return rsp, response_payload

@dirtdevil this is more great feedback, thank you! I’ll get the docs updated and these feature requests logged for our product team to look into :+1:

@RobLauer, @dirtdevil
Thanks to @dirtdevil for the helper code. Trying to transmit jpg’s of 50-70K unsuccessfully. Using 3.2.1 notecard firmware.

Transmits all fragments (all of size 8K except last one) , but get the same single line response after last fragment in all attempts:

DEBUG:root:web.post response {'payload': 'eyJkZXRhaWwiOiJKU09OIHBhcnNlIGVycm9yIC0gJ3V0Zi04JyBjb2RlYyBjYW4ndCBkZWNvZGUgYnl0ZSAweGZmIGluIHBvc2l0aW9uIDA6IGludmFsaWQgc3RhcnQgYnl0ZSIsInN0YXR1c19jb2RlIjo0MDB9', 'result': 400, 'status': '184149ef45854e5dbd2d6002422014c0'}

I assume the response is base64 encoded? Again, always the same response, no matter the jpg size.

The response in your payload is:

import base64
base64.b64decode("eyJkZXRhaWwiOiJKU09OIHBhcnNlIGVycm9yIC0gJ3V0Zi04JyBjb2RlYyBjYW4ndCBkZWNvZGUgYnl0ZSAweGZmIGluIHBvc2l0aW9uIDA6IGludmFsaWQgc3RhcnQgYnl0ZSIsInN0YXR1c19jb2RlIjo0MDB9")
b'{"detail":"JSON parse error - \'utf-8\' codec can\'t decode byte 0xff in position 0: invalid start byte","status_code":400}'

This implies to me that the problem is you are using the Notehub Web Proxy with the default headers. Your upstream service is likely expecting Content-Type: image/jpeg.

In other words, the error is probably coming from the upstream web site because it’s not getting the headers it’s expecting. Sadly, there is no way to set the headers in the web.post request.
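
The decoding step shown above can be wrapped in a small helper so that upstream error bodies are readable right away. This is only a sketch: decode_response_payload is a hypothetical name, and it assumes the “payload” member of the response is always base64, which matches what was observed in this thread but is not documented.

```python
import base64
import json


def decode_response_payload(rsp: dict):
    """Decode the base64 "payload" member of a web.post response.

    Returns parsed JSON when the body is valid JSON, the raw bytes
    otherwise, and None when the response has no payload.
    """
    encoded = rsp.get("payload")
    if not encoded:
        return None
    raw = base64.b64decode(encoded)
    try:
        return json.loads(raw)
    except ValueError:  # covers JSON and UTF-8 decoding errors
        return raw


# The response posted earlier in this thread.
rsp = {
    "payload": "eyJkZXRhaWwiOiJKU09OIHBhcnNlIGVycm9yIC0gJ3V0Zi04JyBjb2RlYyBjYW4ndCBkZWNvZGUgYnl0ZSAweGZmIGluIHBvc2l0aW9uIDA6IGludmFsaWQgc3RhcnQgYnl0ZSIsInN0YXR1c19jb2RlIjo0MDB9",
    "result": 400,
    "status": "184149ef45854e5dbd2d6002422014c0",
}
body = decode_response_payload(rsp)
print(body["status_code"], body["detail"])
```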

Regards,
~Michael

@dirtdevil @scjerry - there is actually an undocumented content parameter in the web.* APIs that allows you to set the content type. For example, "content":"image/jpeg".

I’ll be adding that to the docs soon!
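
As a rough sketch of how that parameter might be used, the request below adds “content” alongside the usual web.post fields. build_web_post is a hypothetical helper, the route name MyRoute is taken from the first post, and since “content” is described above as undocumented, its exact behavior is an assumption here.

```python
import base64


def build_web_post(route: str, payload: bytes, content: str = None, name: str = None) -> dict:
    """Build a (non-fragmented) web.post request with an optional content type."""
    req = {"req": "web.post", "route": route}
    if name:
        req["name"] = name
    if content:
        # Assumption: "content" sets the Content-Type sent upstream,
        # as described in this thread (e.g. "image/jpeg").
        req["content"] = content
    req["payload"] = base64.b64encode(payload).decode("ascii")
    return req


# Placeholder bytes stand in for real JPEG data.
req = build_web_post("MyRoute", b"\xff\xd8\xff\xe0", content="image/jpeg")
print(req)
```

The resulting dictionary can be passed to card.Transaction in the same way as the requests built by the web_post helper earlier in the thread.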

@dirtdevil, @RobLauer
A long haul on this. Again using Michael’s helper and Rob’s suggestions on the content type possibilities, I tried the following Content-Type values: 1) image/jpeg, 2) multipart/form-data, 3) application/json.

None worked, with 2) giving an error: b'{"detail":"Multipart form parse error - Invalid boundary in multipart: None","status_code":400}' after being instructed to use it.

Trying to deal with the endpoint owner. Pretty slow, for a 60K image - about 2 minutes.

@dirtdevil, @RobLauer
So, changed the endpoint to my webhook.site and got this:

No errors. DEBUG:root:web.post response {'result': 200}
Don’t have a clue how to interpret the raw payload.
Note accept-encoding gzip? Is that a Notehub thing?

@scjerry it seems your upstream web route wants multipart form data, but it’s not as simple as just using multipart/form-data. This is because the Content-Type for multipart form data usually looks like this:

Content-Type: multipart/form-data; boundary=SomethingRandom

Here is code I use when dealing with websites that want an upload (note, I have no idea if fileparam is the parameter name your upstream website is expecting…you’ll have to figure that out on your own…or share with us the URL you are trying to post to).

from requests_toolbelt.multipart.encoder import MultipartEncoder
    
mp_encoder = MultipartEncoder(
    fields={
        'fileparam': ("somefile.jpg", jpeg_binary_content, 'image/jpeg'),
    }
)

payload = mp_encoder.to_string()
content = mp_encoder.content_type

# pass payload and content to web_post
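
If requests_toolbelt is not available, a single-file multipart body can also be assembled with the standard library. The sketch below is a simplified illustration, not a full multipart implementation: encode_multipart_file is a hypothetical helper, it handles exactly one file field, and fileparam is the same placeholder field name as above.

```python
import uuid


def encode_multipart_file(field_name: str, filename: str, data: bytes, mime_type: str):
    """Return (body, content_type) for a multipart/form-data upload of one file."""
    boundary = uuid.uuid4().hex  # random string unlikely to appear in the data
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    # The boundary must be repeated in the Content-Type header so the
    # server can split the body into parts.
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + data + tail, content_type


# Placeholder bytes stand in for real JPEG data.
payload, content = encode_multipart_file("fileparam", "somefile.jpg", b"\xff\xd8\xff\xe0", "image/jpeg")
print(content, len(payload))
```

The missing boundary in the header is what produced the “Invalid boundary in multipart: None” error reported earlier.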

Thanks again. What’s your experience for transmittal times? A 60K image takes 2 minutes, which may be a problem.

@scjerry I’m measuring about 12 seconds per 8k chunk. So a 60K image should take about 96 seconds (1 and a half minutes) by my measurements and would be consistent with your experience.

@RobLauer have there been recent changes to the Notehub web.post response? I used to receive a result: 100 on each fragment. Now I simply get a {'total': XX} for each fragment, where XX counts up with each fragment. This XX doesn’t always start at zero, so it appears that web.post is not resetting its state when it receives an offset: 0 in the request. It also no longer appears to be checking the MD5 sum (i.e. if I replace status with bogus values, it continues to accept the fragment).

Interesting, even when I do a non-fragmented web.post I simply get a response of {'total': XX}.

Perhaps it’s something that changed on my end, but I thought I’d check here as well.

@dirtdevil This is interesting because this isn’t the behavior I’m seeing with a web.post. Would love to dig into this a bit:

  1. Which Notecard model are you using (e.g. NBGL/NBNA/WBEX/WBNA)? For file transfer scenarios on my Pi using a WBNA it’s only a few seconds to transfer an 8KB chunk (granted, will be a little slower for an NB/narrowband Notecard).

  2. Which version of the firmware are you on? I just tested on 3.2.1 and the response I see for a web.post includes only payload and result. Positive you’re doing a web.post and not a note.add with a templated notefile?

  3. If I supply a made up string to status, I get an error locally-computed MD5 ... doesn't match MD5 ... in JSON request.

@RobLauer I’m using Raspberry PI + NBGL.

{"req": "card.version"}
{"body":{"org":"Blues Wireless","product":"Notecard","version":"notecard-3.2.1","ver_major":3,"ver_minor":2,"ver_patch":1,"ver_build":13982,"built":"Feb 3 2022 15:02:36"},"version":"notecard-3.2.1.13982","device":"dev:X","name":"Blues Wireless Notecard","sku":"NOTE-NBGL-500","board":"1.11","api":3}

I’m measuring a consistent 8-12 seconds per 8kByte chunk. The same performance exists using either a RPi+Notecard or a Notecard directly connected to the computer serial port.

In the case of I2C, all of that time is in the _sendPayload which has the unfortunate behavior of sending 255 byte chunks (if max_transfer is left as the default of 0). Each chunk has a minimum time of 0.251 seconds. So with 8192Byte chunks, that’s about 11000Byte base64, which is 43 chunks, which is a total transfer time of 10.8 seconds.
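
The arithmetic above can be reproduced with a few lines. This is a back-of-the-envelope sketch using the numbers quoted in this post (255-byte segments, 0.251 seconds per segment); it counts only the base64 payload and ignores the surrounding JSON, which is an assumption. estimate_i2c_fragment is a hypothetical name.

```python
import math


def estimate_i2c_fragment(raw_bytes: int = 8192, segment_len: int = 255,
                          segment_seconds: float = 0.251):
    """Estimate the I2C send time for one web.post fragment.

    Returns (base64_length, segment_count, total_seconds).
    """
    b64_len = 4 * math.ceil(raw_bytes / 3)        # base64 expands 3 bytes to 4 chars
    segments = math.ceil(b64_len / segment_len)   # number of I2C segments needed
    return b64_len, segments, segments * segment_seconds


b64_len, segments, seconds = estimate_i2c_fragment()
print(b64_len, segments, round(seconds, 1))
```

With the defaults this gives 10924 base64 characters in 43 segments, or about 10.8 seconds per fragment. Passing a smaller segment_seconds shows how strongly the per-segment delay dominates the total.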

All of the examples I’ve found show OpenI2C being called with max_transfer=0. I couldn’t find any guidance on what the range of appropriate values is, but if I try to change it to anything greater than 256 it throws an error because reg is a bytearray. The other alternative is to change the CARD_REQUEST_SEGMENT_DELAY_MS to a much lower value, but that would only seem feasible if the Blues team agrees that is a suitable change.

@scjerry TLDR - with the Python library the data rate is being limited by CARD_REQUEST_SEGMENT_DELAY, CARD_REQUEST_SEGMENT_MAX_LEN (SERIAL), and max_transfer (I2C)

@RobLauer are you using the Python library? If not, what are you using? I’m curious how you are transferring faster than one 255-byte chunk every 0.25 seconds.

As for the strange change in behavior (i.e. web.post returns “total” instead of the expected “result”), I don’t have a good explanation for that yet.

I’ve tried three different notecards (two with RPi and one directly connected to the computer). All of them are using the exact same Notecard firmware and Python code.

The one directly connected to the computer serial is behaving correctly; but both the RPi devices are getting the same unexpected “total” behavior for both fragmented web.post requests and standard web.post requests. In fact, I can even specify invalid “route” values and simply get a response of {'total': X}.

Here are some diagnostics that might help you guide me to a solution:

$ notecard -port /dev/i2c-1 -interface i2c -verbose -explore
{"req":"file.changes"}
{"info":{"_web.dbx":{},"_req.qis":{}}}
    _req.qis
{"req":"note.changes","file":"_req.qis","deleted":true}
{}
    _web.dbx
{"req":"note.changes","file":"_web.dbx","deleted":true}
{"err":"json: insufficient memory {memory}"}
note.changes: json: insufficient memory {memory}


$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "web.post", "route": "SavvyAnalysis", "payload": "SGVsbG8gV29ybGQ=", "total": 11}'
{"req":"web.post","payload":"SGVsbG8gV29ybGQ=","route":"SavvyAnalysis","total":11}
{"total":1}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "web.post", "route": "SavvyAnalysis", "payload": "SGVsbG8gV29ybGQ="}'
{"req":"web.post","payload":"SGVsbG8gV29ybGQ=","route":"SavvyAnalysis"}
{"total":2}

On a whim I did a card.restore and that fixed the insufficient memory error. I then ran:

$ notecard -port /dev/i2c-1 -interface i2c -verbose -explore
{"req":"file.changes"}
{}
no notefiles

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "web.post", "route": "SavvyAnalysis", "payload": "SGVsbG8gV29ybGQ="}' 
{"req":"web.post","payload":"SGVsbG8gV29ybGQ=","route":"SavvyAnalysis"}
{"total":1}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -explore
{"req":"file.changes"}
{"info":{"_web.dbx":{"total":1},"_req.qis":{}},"total":1}
    _req.qis
{"req":"note.changes","file":"_req.qis","deleted":true}
{}
    _web.dbx
{"req":"note.changes","file":"_web.dbx","deleted":true}
{"notes":{"UQ129":{"body":{"compress":2,"route":"SavvyAnalysis","method":"POST"},"payload":"CyhIZWxsbyBXb3JsZA==","time":1645909355}},"total":1}
        UQ129
            {"compress":2,"method":"POST","route":"SavvyAnalysis"}
            Payload: 13 bytes

@RobLauer I hope that these logs point to something dumb I’m doing, because otherwise I cannot explain the behavior.

@dirtdevil Can you make sure your Notecard is in continuous mode (specified in the hub.set request)? The web.* APIs are only supported in continuous mode and I suspect you are in periodic.

Yes, I was aware of that and have confirmed that multiple times. For good measure my Python code always checks the mode and, if necessary, sets the mode to continuous and then waits for a connection before proceeding with the web.post.
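
For reference, the check described above looks roughly like the sketch below. It assumes a card object with a Transaction method, as in the web_post helper earlier in the thread. ensure_continuous is a hypothetical name, the timeout values are arbitrary, and treating a {connected} tag in the status string as "connected" is an assumption based on the hub.status output seen in this thread.

```python
import time


def ensure_continuous(card, timeout_s: float = 120.0, poll_s: float = 2.0,
                      sleep=time.sleep) -> bool:
    """Switch the Notecard to continuous mode if needed, then wait for a connection.

    Returns True once hub.status reports a connection, False on timeout.
    """
    mode = card.Transaction({"req": "hub.get"}).get("mode")
    if mode != "continuous":
        card.Transaction({"req": "hub.set", "mode": "continuous"})

    waited = 0.0
    while True:
        status = card.Transaction({"req": "hub.status"})
        # Assumption: either the "connected" boolean or a {connected} tag
        # in the status text indicates an open connection.
        if status.get("connected") or "{connected}" in status.get("status", ""):
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
```

The sleep parameter exists only so the polling loop can be exercised without real delays.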

Things are working today even though I changed nothing in my code, the Notecard, or the route. I’ll take that as a win and assume it was either intermittent or somehow something I inadvertently did wrong. If it returns again I will provide you with more details.

Thanks for all your help!

Did you have any thoughts about the CARD_REQUEST_SEGMENT_DELAY/LEN? Can those safely be changed to increase the throughput?

Hey @dirtdevil, you can reduce the value of CARD_REQUEST_SEGMENT_DELAY to improve transfer speeds. We’ve had a request previously to make this configurable, but I’ve not yet had a chance to implement that feature. In the meantime, feel free to tweak it yourself, but beware that if you go too low, you’re going to start to step on the Notecard as it’s trying to respond to your previous request. You can’t safely go any lower than 25 ms and I would even give yourself a bit of buffer above that, if possible.

@bsatrom thanks for the guidance! Are there any downsides or trade-offs associated with a shorter SEGMENT_DELAY? In other words, was the default conservative for any reasons that I should be aware of?

@RobLauer the behavior has returned. Here is a command showing that I’m in ‘continuous’ mode.

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "card.restore"}'
{"req":"card.restore"}
{}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "hub.status"}'
{"req":"hub.status"}
{"status":"16s starting communications {wait-module} {connecting}"}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "hub.status"}'
{"req":"hub.status"}
{"status":"connected (session open) {connected}"}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "hub.get"}'
{"req":"hub.get"}
{"mode":"continuous","host":"a.notefile.net","product":"X","device":"dev:X"}

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "web.post", "route": "SavvyAnalysis", "payload": "SGVsbG8gV29ybGQ=", "total": 11}'
{"req":"web.post","payload":"SGVsbG8gV29ybGQ=","route":"SavvyAnalysis","total":11}
{"total":1}

Things were working fine, until in the middle of a web.post I got this error:

lib/python3.9/site-packages/notecard/notecard.py", line 380, in Transaction
    raise Exception("notecard request or response was lost")

After that, I started seeing additional strange behavior.

{"req": "card.version"}
{"err":"cannot interpret JSON: unrecognized base64 at offset 6538 {io}"}

{"req": "hub.status"}
{"status":"idle (can't open session to notehub: socket open PPP: ppp connection timeout?) {disconnected} {idle}"}

Side note: as you can see above, there are situations where hub.status is missing the connected boolean in the response. Is this expected, and should I therefore also parse the “status” field itself to see if the Notecard is connected?

$ notecard -port /dev/i2c-1 -interface i2c -verbose -req '{"req": "hub.status"}'
{"req":"hub.status"}
{"status":"connected {connected-closed}"}
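
Until that question is answered, one workaround is to read the tags in curly braces from the status text. The sketch below is an assumption-heavy illustration: status_tags and is_connected are hypothetical helpers, and whether a tag such as {connected-closed} should count as connected is exactly the open question, so it is treated as not connected here.

```python
import re


def status_tags(rsp: dict) -> set:
    """Extract the {tag} markers from a hub.status response's "status" text."""
    return set(re.findall(r"\{([^{}]+)\}", rsp.get("status", "")))


def is_connected(rsp: dict) -> bool:
    """Use the "connected" boolean if present, else fall back to the {connected} tag."""
    return bool(rsp.get("connected")) or "connected" in status_tags(rsp)


# Status strings copied from the hub.status responses in this thread.
for text in (
    "connected (session open) {connected}",
    "connected {connected-closed}",
    "16s starting communications {wait-module} {connecting}",
):
    print(sorted(status_tags({"status": text})), is_connected({"status": text}))
```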

Again! Thanks for digging into all these details with me. I love the Notecard/Notehub system and I hope that you find this helpful as well.
