Meeting minutes
We're all getting a bit of a late start this morning :)
Introductions
https://github.com/immersive-web/layers/issues/265
cabanier: This seems to have been scheduled earlier but couldn't find discussion or resolution
… If you are playing a stereo video and then show the UI, you would like to make the video mono
… Currently no way to do that.
… Would mostly be an attribute on cylinder and quad layers, could also apply to GL layers
bajones: I remember we talked about this before but it's good to talk about again
… in layers stereo or not is set at layer creation time
… is it possible to create and swap in a non stereo video
cabanier: the issue is that the video is top-bottom so you would see that
bajones: you could then choose to render just half the content
bajones: if you are going to do it for a media you might as well do it for the others too
bajones: as a follow up, because meta has been the only one who has implemented layers what would be the implementation cost of this
cabanier: it's simple to implement
… should we come up with a name for the property?
… 'force mono'?
bajones: would it be possible to make the attribute mutable?
cabanier: not really
bajones: this would probably only be used in transient situations, the app would still continue as normal but the compositor would only do half the work
cabanier: we would duplicate the left eye view to the right eye
cabanier: feels like it should just be boolean
bajones: I wonder if it's worth allowing the developer to specify how it is shown so the developer can pre-emptively optimise by not rendering a particular eye
ada: an enum would give us more freedom down the line
cabanier: but they are really annoying to spec
bajones: I don't really care too much between bool and enum but enum could be useful
bajones: if it's in the spec we should definitely define which eye is preferred
ada: is there anywhere else in the spec where one eye is favoured?
cabanier: no
bialpio: We could do enum disabled/enabled then later do force left force right
bajones: (to cabanier) I am worried that doing this for projection layers wouldn't work on weird displays
cabanier: all but projection layers
… cylinder, quad, cube, equirect
cabanier: name?
bajones: I don't love forceMono
bajones: although it does seem fitting
bajones: maybe forceMonoView or forceMonoPresentation to inform that it's not the shape changing
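The behaviour discussed above can be modelled as a small lookup. This is only a sketch of the semantics described in the discussion: the names `MonoMode`, `LayerKind`, and `sourceEyeFor`, and the enum values themselves, are hypothetical and do not come from the Layers spec.

```typescript
// Hypothetical enum shape following bialpio's suggestion: start with
// disabled/enabled and leave room for explicit eye selection later.
type MonoMode = "disabled" | "enabled" | "force-left" | "force-right";
type Eye = "left" | "right";
type LayerKind = "projection" | "quad" | "cylinder" | "cube" | "equirect";

// Returns which eye's sub-image the compositor would sample when drawing
// `targetEye`. Illustrative model only, not spec text.
function sourceEyeFor(kind: LayerKind, mode: MonoMode, targetEye: Eye): Eye {
  // Projection layers were excluded in the discussion, so the mode is ignored.
  if (kind === "projection" || mode === "disabled") {
    return targetEye;
  }
  if (mode === "force-right") {
    return "right";
  }
  // "enabled" and "force-left" both duplicate the left eye view,
  // matching cabanier's description of the simple boolean case.
  return "left";
}
```

With this shape the app keeps rendering as before, and only the compositor's sampling changes, which is why the name is meant to signal a change of presentation rather than of layer shape.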
https://github.com/immersive-web/layers/issues/287
What is the disadvantage of creating multiple layers
It is expensive to create
bajones: why not destroy the low res layer and create a high res one?
<bajones> rik: Customer with a video playback library is streaming in multiple video resolutions
<bajones> ... Wants to select based on which comes in first
<bajones> ... Would like to swap video source mid-stream
<bajones> ... Can't be an attribute on the layer
<bajones> bajones: Why not?
ada: you establish layer from the video element, they are not changing the url of the video, right?
<bajones> rik: Because layers are currently agnostic to source, want to preserve
rik: changing the url,yes
if you change the source of the html video element, you are starting a new stream
the purpose is to download the higher resolution of the same video.
piotr: is the streaming bandwidth the issue?
bajones: starting the low res stream and waiting for the high res to switch. I am reluctant to create two steps where it could be on one.
… I would say, if there is a concrete reason why this supports the user in doing something we can do more efficiently, then we should do it
lgombos: you say a new download starts because you change the source; does it really start a new download, do you know?
rik: I don't know but they are getting a black frame
bajones: maybe they are doing something with start and destroy. if the issue is opaque layer and that's the reason they are getting black that might be a useful thing to communicate.
… maybe it is a signal that we dont need but I'm not sure
rik: the feedback was
bajones: I would like to hear more technical details. It might be something they can change in the userspace
… maybe the source swapping is the efficient way or we might need to give a signal
manish: html video elements already support multiple sources.
… it can also be used for resolution. I don't know how smooth that is
… to me this seems like a problem that needs to be fixed by video.
… it is a problem with video in general
… we should look into what we have right now
marcos: it is subtly different with video conditions but it will switch
will: if multiple sources video will use the first one.
… working on HLS that will be available by Q4. hls.js is available today
will: up to the player to decide how this will be implemented
rik: shakaplayer works with media layers
ada: I think this resolves this issue
https://github.com/immersive-web/layers/issues/288
ada: let's move on the next topic - Break!
rik: this should be short: https://
… no frame available is black; can we make it transparent?
bajones: purple!
… is there a default on gl layers?
rik: I think it's transparent
bajones: surprising media layers don't do that?
rik: originally we didn't have opacity, so maybe that's why? but that's gone away
ada: so a transparent PNG on a media layer would be black?
rik: no, transparent
… opacity is a multiplier
bajones: behaviorally seems fine. are there scenarios where we can collapse transparency?
… can videos tell you if they're transparent?
rik: i don't think Chrome supports this
bajones: that's an optimization anyway. I think this is fine.
ada: it makes sense to me to start off as transparent.
ada: resolved
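The resolved behaviour can be summarised in a few lines. This is a minimal sketch under the assumption that colours are non-premultiplied RGBA values in the range 0 to 1; the names `Rgba` and `mediaLayerPixel` are hypothetical and only illustrate "transparent when no frame is available" and "opacity is a multiplier".

```typescript
// Non-premultiplied RGBA, each channel in [0, 1] (an assumption of this sketch).
type Rgba = [number, number, number, number];

// Models what a media layer contributes for one pixel.
// `frame` is null while the video has not produced a frame yet.
function mediaLayerPixel(frame: Rgba | null, opacity: number): Rgba {
  if (frame === null) {
    // Resolution from the discussion: start off transparent, not black.
    return [0, 0, 0, 0];
  }
  const [r, g, b, a] = frame;
  // Opacity multiplies whatever alpha the content already has.
  return [r, g, b, a * opacity];
}
```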
… next item on the agenda is about <model>
ada: but let's wait on that until the scheduled time
bajones: let's talk about immersive audio
… web audio is like webxr, imperative API
… WA can spatialize audio through HRTFs (head-related transfer functions)
… WA is looking at pulling in file formats with spatialized audio
… also 3DOF "hearables" - audio AR
… how do they get the data streams out of those types of devices
… kinda like Cardboard, with no visual component
… all our APIs have video component, which has a privacy aspect to it
… it would be interesting for this case to have something like a type of session with just the tracking aspect, and no visual aspect to it
… having that kind of session would be beneficial for not only this, but other scenarios - I can't remember where atm but I've heard this request before
<Zakim> ada, you wanted to ask about declarative
ada: should be really interesting, if audio was more declarative, if you could place this in the format rather than using imperative APIs.
… omnitone apparently gets this wrong now
… that would work well for adding audio to immersive AR scenes.
rik: after meeting with audio folks, I looked in to this, and what people use - Howler.js seems to be pretty well supported.
… maybe the device orientation API would be easier? it also pops a permission prompt, might be easier.
<ada> cwilso: I just wanted to comment on a couple of things: device orientation would be good enough for 3DOF; audio in some areas is temporally very strict and in others is very forgiving
<ada> cwilso: three.js has built-in panner node support
<ada> cwilso: the only thing that has to happen behind the scenes is the panner node input
bajones: did a little digging, found existing issue
… on 6DOF audio-only session. I thought this was from a specific company, but not sure.
… with what Rik was saying about device orientation, the only problem is that it doesn't represent multiple devices (e.g. it doesn't track orientation of "device on my head", but just "the device")
… with devices like Airpods, it might only be a 3DOF pose, but might be 6DOF in the future
… we talked about those kind of devices in the past
… I think there's enough of an overlap, and it maps into the idea of our API
ada: who's interested in the intersection of audio and WebXR, e.g. this audio-only session?
bajones: seems like two questions: 1) integration of Web Audio and XR , e.g. wiring up a panner node
… 2) do we want types of sessions for no-visual-component sessions
… may be paths forward for both of these
… "do we want to do things to advance these"
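For the first question, the wiring mostly amounts to turning a tracked orientation into the forward and up vectors that a Web Audio listener expects. The sketch below assumes a unit quaternion like the one in a WebXR viewer pose and the WebXR convention that the viewer looks down -Z with +Y up; the names `Quaternion`, `Vec3`, and `listenerVectors` are hypothetical helpers, not part of either API.

```typescript
interface Quaternion { x: number; y: number; z: number; w: number; }
interface Vec3 { x: number; y: number; z: number; }

// Rotates the forward axis (0, 0, -1) and the up axis (0, 1, 0)
// by a unit quaternion, using the standard rotation-matrix expansion.
function listenerVectors(q: Quaternion): { forward: Vec3; up: Vec3 } {
  const { x, y, z, w } = q;
  // Forward is the negated third column of the rotation matrix.
  const forward: Vec3 = {
    x: -2 * (x * z + w * y),
    y: -2 * (y * z - w * x),
    z: -(1 - 2 * (x * x + y * y)),
  };
  // Up is the second column of the rotation matrix.
  const up: Vec3 = {
    x: 2 * (x * y - w * z),
    y: 1 - 2 * (x * x + z * z),
    z: 2 * (y * z + w * x),
  };
  return { forward, up };
}
```

In a page, the resulting values could be written each frame to the `AudioListener` parameters `forwardX`/`forwardY`/`forwardZ` and `upX`/`upY`/`upZ`, with the pose position going to `positionX`/`positionY`/`positionZ`. The same helper would work for a 3DOF source, since it only needs an orientation.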
<cabanier> +1 on making audio integration better
leonard: do we need audio-only sessions for accessibility reasons?
bajones: I don't think it's necessary; but we might want to encourage people to have better audio cues in their experiences.
… special mode is probably better for when you're dealing with hardware limitations (e.g. pair of glasses that only has audio)
ada: would that hardware-limited scenario break things today?
brandon: yes, probably
… due to exposing zero views
… it might work, but it would be fragile in how experiences are authored. for this modality of apps, you really want it to be more like an inline session that is declared as audio-only
brandel: new iOS has better HRTF details (e.g. shape of your ear). Curious to know if Panner Node supports this.
… stereopanner just does stereo panning
cwilso: web audio just has a generic use a hrtf to do this, there is a default hrtf that the useragent could replace with a better one, to do a better experience.
… no one has thought about doing that yet
<Zakim> cwilso, you wanted to note https://
cwilso: on the other two issues I would like to call out these issues about non visual sessions, which deny allow of non-visual sessions and an issue for hooking up panner nodes
ada: just wanted to mention that 8th Wall wanted tracking-only sessions to have their own visual implementation
ada: let's move on to the <model> element, with a new scribe
https://github.com/immersive-web/model-element/issues/55
introductions
<Leonard> @Ada: It's your new job
Emmett: already fair bit of discussions - interesting idea, but...
… I've been arguing that we're not yet ready to standardize, arguments already in the issue
… I think we should be talking about what problems we're trying to solve, as standardizing <model> may be A solution, but not THE solution
… main point is that right now <model> looks a lot like <model-viewer>
… so far the main advantage is that we can skip a permissions prompt that WebXR would show
… but it may not be a reason enough to go with standardizing
… especially since the API shape can get massive
… so the main question is what the goals are
marcosc: apologies if intentions not clear enough, let's rehash...
… goal is to have a simple way to include 3d models in the web pages
… commerce case really important
… AR case - it'd be cool if we didn't have to download such components twice
… accessibility story is more compelling
… API surface is going to be a challenge
Emmett: what is the delta? what does a standardized element give you that you won't get from existing options? what do you gain?
marcosc: browser renders it for you, so you don't need to download any JS
… no dependency for any JS library
… you can get new format support for free
Emmett: how do you get a consistent format support across different browsers, & why having a standardized element is better
… we right now have consistent rendering across browsers and we can rapidly iterate on the solution
… I don't understand how we're going to achieve that when we have different browsers w/ different schedules
marcosc: <model> tag does not preclude the solutions in JS
… browsers may be behind but over time they stabilize and catch up
… the advantage is that it's built-in into the browser, we have a baseline
ada: re feature gap of model vs model-viewer - that's not a big disadvantage, we can keep adding things to browser impl
… if at the start model doesn't work for people, they can still rely on model-viewer
alexturn: it may come down to philosophy of what the web should do
… my brain goes to: what can't you get with the current solution
… similarly w/ VR and AR browsers
… there are things you can do but you are limited to the plane of the browser
<yonet> Josh Carpenter demo: https://
alexturn: when you have model tag, we can now do things in headsets
… we may reach a point where we use models for UI elements and requesting WebXR for all little things would be an overkill
Emmett: I don't see how the dots connect between Josh's slides and the browser yet
… I'd get more interested in it if I saw how those 2 connect
… when I look at Josh's slides, I don't see a browser, it's more like a maps experience
Brandel: I've been playing w/ Apple's technology preview demos with icons
… I'd echo alexturn - it's an opportunity for the browser to do w/ the information that is privileged
… there are things that aren't safe to expose to the site
… so what is it that people want to achieve?
… we wouldn't consider using WebXR e.g. in apple.com, permission prompt is the main reason
… we can also have dedicated hardware and native libraries that'd be more efficient to use rather than JS
… it's valuable to have browser-level support
… you can do lighting estimation in immersive WebXR, but it'd be nice to do something similar mediated by the browser, without exposing the camera to the site
… with that you can see reflections on the object
<Zakim> cgw, you wanted to react to Brandel to discuss a brief chairing reminder
Brandel: <model-viewer> is good example of the use cases, but it won't be able to do the same thing as the browser
bajones: everybody's talking about Josh's slides - 2nd half goes into how this could look like in a browser
… we talked through those concepts w/ Josh
… everything you see here is far-looking, and we approached it through "what could we do through WebXR"
… so I don't think it requires the browser to be managing this
… but how can we do this without the prompt
… seems like we'd like to be able to hand off rendering to the OS components if they exist
… but the concern here is consistency
… having sat through glTF meetings, what comes back is that we need things to render consistently everywhere in a manner that is close to real life
… what we don't want is having the model be rendered completely differently across browsers
… there will always be differences in capabilities so we may need to be able to opt in to different capabilities
… but consistency is difficult if rendering mechanism is OS-level
<alexturn> Josh Carpenter's slides: https://
bajones: it stops being a problem if you rely on JS library
… so commerce can fall back to JS simply because those use cases could then rely on rendering consistency
yonet: when we previously met, there was a lot of questions and Dean promised demos
yonet: so we could see what is an MVP
… as it may affect discussions
<Brandel> my headphones just died so I am trying to recalibrate
marcosc: demo is what we released behind a flag
Brandel: we have demos fit for public consumption I think?
<Leonard> Not everyone has Safari. Can we see something (screen share)?
Brandel: straightforward demonstrations of what we think should be possible
<Zakim> cwilso, you wanted to discuss baseline and to discuss object
Brandel: demos tomorrow
cwilso: taking off my chair hat
… my concern is that <model> is the baseline, built into browser, but that may not be true
… as we cannot guarantee it will happen everywhere in a consistent manner
… <img> has baseline that all browsers implement
… and there are extensions
… I'm worried that if we don't have an interoperable baseline, we will fail
… point of standards is to be interoperable
… so we should not call it a web standard
marcosc: this is an incubation
cwilso: so we need to be careful how we communicate
marcosc: agree, that's why this is an incubation, that's why we're reaching out now
… we need to prove that we can render consistently
Emmett: we've gone through this as well in model-viewer since for AR on iOS we have to use QuickLook
cabanier: there are examples in Josh's slides that were explicitly a browser
… in quest browser the power is that it can be rendered in 3d
… we could do reflections and we cannot do those w/o permission prompts today
… as for consistency, we may not even be here right now as different browsers can render things inconsistently even now
bajones: interesting where the line in the sand is
… but the problem is that if one browser renders glass correctly and the other renders it as gray blob
… similarly, if one browser comes up w/ hair model and the other does not
… so there are distinctions between incorrectly representing colors and inconsistent rendering of models
Emmett: one case in point is now with how roughness gets displayed
… it's less about the colors
… glTF is what aims to solve this
… path tracers are the baselines and rasterizers should aim to be close to those
bajones: lighting being used as an input for rendering is a nice idea, but all the current devices that I've used use low res approximation of env lighting
… so for shiny models you may run into inconsistencies as well
… that indicates that we cannot hand off things to the renderer and be done
Leonard: the way this is done is presented as new tag but now we still need to figure out a lot of stuff
… the most important part in all of this is correctly rendering the 3d model, including animations
… commerce retailers are interested in non-static things being shown
… it's concerning to me that it's not addressing questions around rendering, camera, lighting...
klausw: one thing that is confusing is what we want to include initially
… how do we add things later
… how will the site author know what is available
… so if animation gets added later, how do we surface it to the site authors
… so it'd be good to have a process for adding features
… since there may be a long tail of capabilities
… that aren't implemented across the board
yonet: lgombos and marcosc are points of contact for the repo
marcosc: please file issues in the repo in case we didn't cover something
https://github.com/immersive-web/model-element/issues/18
bajones: this ties into the earlier discussion about consistency
… which format model tag chooses to support
… so we should discuss this at length
… earlier we talked about matching the video element by having multiple src tags
… I think that model was widely seen as a mistake
… browsers like firefox ended up broken because it didn't decode all formats
… I'm worried that a fair number of people will only target their platform of choice and leave out the formats of other browsers
… I think this is the most important choice
… I know that Apple prefers USDZ
… Google prefers GLTF
… we like the fact that it is well standardized
… it's proven to be easy to render in javascript and native
… and there's concern that USDZ isn't standardized at the same level
… the standard is USDZ = USD in a zip
… USD is a black box so I don't think this is an appropriate format
marcosc: thanks Brandon
… from webkit/Apple side, we like the other vendors to have strong opinions
… so if you're another vendor, please voice your preference
… as for the video, we support various formats
… but if we can agree on a format, that is great
… but it shouldn't preclude different experimentations
… maybe there's a future format which is fantastic
… the advantage of the src option, is to allow media queries
… it's well suited for various environments
… the picture and video element are used in the same way
… this is why we went with that model despite its pains
… Apple thinks USDZ is a good format but if everyone disagrees, we might need to revisit
<Zakim> cwilso, you wanted to read back Domenic's comment
cwilso: I have 2 things
… Domenic mentions video and requiring royalty-free formats
… he suggests that there's a minimum bar for the format that is picked
… I don't know if we can even do that. Having an open specification is of the utmost importance
marcosc: I agree
lgombos: marcos asked for feedback, Samsung prefers GLTF
… for interop, we already discussed it quite a bit
… most of it is in the content itself which is done in another group
… so if we decide what the baseline and format is, compatibility and standardization is most important
bajones: Marcos brought up media queries
… it is not that multiple sources isn't the way to go
… that use case of media queries should be supported
… but that shouldn't extend to different formats
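The source-selection behaviour being debated can be sketched as a first-match search, similar to how `<video>` and `<picture>` pick among candidates. Everything here is an assumption for illustration: the `ModelSource` shape, the `pickSource` helper, and the injected predicates are hypothetical, and the URLs are placeholders. The MIME types shown are the registered ones for binary glTF and USDZ.

```typescript
// Hypothetical description of one <source>-like candidate.
interface ModelSource {
  src: string;
  type: string;   // MIME type of the model file
  media?: string; // optional media query, as in <picture>
}

// Picks the first candidate whose media query matches and whose format the
// user agent can decode. Both checks are injected so the sketch stays
// independent of any browser API.
function pickSource(
  sources: ModelSource[],
  supportsType: (type: string) => boolean,
  matchesMedia: (query: string) => boolean,
): ModelSource | undefined {
  for (const source of sources) {
    const mediaOk = source.media === undefined || matchesMedia(source.media);
    if (mediaOk && supportsType(source.type)) {
      return source;
    }
  }
  return undefined;
}

// Example candidates: a small and a large variant in one format, plus an
// alternative format. A page that lists only one format leaves some user
// agents with nothing to render, which is the concern raised above.
const exampleSources: ModelSource[] = [
  { src: "chair-small.glb", type: "model/gltf-binary", media: "(max-width: 600px)" },
  { src: "chair.glb", type: "model/gltf-binary" },
  { src: "chair.usdz", type: "model/vnd.usdz+zip" },
];
```

Restricting the mechanism to media queries, as bajones suggests, would correspond to dropping the `supportsType` check because every candidate shares one mandated format.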
Emmett: (???) you might have the data at Apple
… but we have a converter from gltf to USDZ
… it's very difficult
… not that many people create USDA file
… so if you have metric of how many people use that format, you will know how many people use modelviewer
Leonard: gltf supports many things
… lately gtx was added
bajones: this is more about consistency
… USDZ and GLTF both have extension methods
… and not all extensions need to be supported by a renderer
… and might not even make sense
… we need to make a consideration so users can know what features are supported by the browsers
… we need to offer user control
… maybe you have a model that has all the latest features, but maybe one UA doesn't support it in which case the author should be able to disable it
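For glTF specifically, the file already declares which extensions it uses and which it requires, so a support check along these lines is possible. This is a sketch only: `checkExtensionSupport` and `SupportReport` are hypothetical names, and how a user agent would expose its supported list to authors is exactly the open question. The `extensionsUsed` and `extensionsRequired` arrays are top-level glTF 2.0 properties.

```typescript
// The two top-level glTF 2.0 properties this sketch relies on.
interface GltfExtensions {
  extensionsUsed?: string[];
  extensionsRequired?: string[];
}

interface SupportReport {
  canRender: boolean;        // false if any required extension is missing
  missingRequired: string[]; // the model cannot be shown without these
  missingOptional: string[]; // the model renders, but may look different
}

// Compares a model's declared extensions with what a renderer supports.
function checkExtensionSupport(gltf: GltfExtensions, supported: string[]): SupportReport {
  const required = gltf.extensionsRequired ?? [];
  const used = gltf.extensionsUsed ?? [];
  const isMissing = (name: string): boolean => supported.indexOf(name) === -1;
  const missingRequired = required.filter(isMissing);
  const missingOptional = used.filter(
    (name) => isMissing(name) && required.indexOf(name) === -1,
  );
  return { canRender: missingRequired.length === 0, missingRequired, missingOptional };
}
```

An author could use the `missingOptional` list to decide whether to disable a feature or fall back to a JS renderer when consistency matters more than native rendering.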
Emmett: I'm unsure if anyone talked to NVidia
… they seem very interested in web and 3D format
… they are using USD as the scene formatting stuff and gltf for the format (?)
marcosc: I did read that as well
marcosc: it's a bit buzz-worthy but I agree that it's pretty cool what they are doing
… to Leonard's point about rendering consistency, we've done a good job and it is getting better
… we will figure this out as we go along
… there are better use cases, and the formats provide rendering hints.
bajones: this is the nvidia push
… worth mentioning that Khronos is doing a similar effort
… it's a collection of scenes and just as buzz-worthy
https://github.com/immersive-web/model-element/issues/56
Ada: Thinks CORS should be required.
Rik?: <model> should just be like <img>
Ada: If we limit polyfills to only JS, then it imposes a circular limit
Piotr: Easier to relax requirement than add it later. Propose to initially polyfill with requirement, then reduce it later
Ada: Could this be a non-normative requirement?
Note: It === CORS
???: Hard to do non-normative security requirements
Marcos: Agrees with Piotr
Ada: Does it work on video?
Marcos: No. Video is a single source. Models are not
Rik: Models are not self-contained?
Brandon: USDZ similar to GLB, pack everything into a single file, but not required
Marcos: Trying to reduce attack surface by requiring confirmation that accessing a separate server is OK
<Zakim> klausw, you wanted to react to klausw
Marcos: Originating content establishes relationships with other servers
Klaus: Control access to resources to save costs, etc.
Rik: Doesn't like the idea of preventing access
… [really more than that, but it is kind-of subtle]
Marcos: Provides explanation of what happens.
[losing conversations and people speaking...]
<Zakim> ada, you wanted to talk about the patchwork nature of the web
Some discussion of limiting glTF to not allow secondary connections. [that would break glTF -- LD]
Ada: Wants feedback from Architecture group before reaching decision.
Marcos: Agrees
Rik: Want to make sure it is done for good reasons.
Marcos: Already gave example
<cabanier> Leonard: Rik mentioned disallowing subrequest
<cabanier> ... that would give you geometry and nothing else
<cabanier> ... and this would prohibit certain domains
<Brandel> Leonard: It's possible to set the zoom view to 'speaker' rather than 'gallery', and then pin the 'Granville' participant to get the folks in the room fullscreen
<cabanier> ada: are you saying things can be pulled from anywhere
Rik: Still likes a single-file complete model
Piotr: Concerned about excessive bandwidth usage
Ada: That issue has been around since the beginning of the web
Klaus: HTTP Referrer header already can do that
Conclusion: Ada will take the issue to the TAG. Expects the response to be "Use CORS"
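The concern about models pulling content from other servers can be made concrete for glTF, where buffers and images may be referenced by URI rather than embedded. The sketch below lists the subresources that would need a cross-origin check. The names `GltfLike` and `crossOriginSubresources` and the example hosts are hypothetical; only the `buffers[].uri` and `images[].uri` fields come from glTF itself.

```typescript
// The subset of a glTF JSON document that can reference external files.
interface GltfLike {
  buffers?: { uri?: string }[];
  images?: { uri?: string }[];
}

// Returns the absolute URLs of subresources whose origin differs from the
// embedding page. These are the fetches a CORS requirement would gate.
function crossOriginSubresources(
  gltf: GltfLike,
  modelUrl: string,
  pageOrigin: string,
): string[] {
  const entries = (gltf.buffers ?? []).concat(gltf.images ?? []);
  const result: string[] = [];
  for (const entry of entries) {
    const uri = entry.uri;
    // Entries without a uri live inside the file itself (as in GLB), and
    // data: URIs carry their payload inline, so neither triggers a fetch.
    if (uri === undefined || uri.indexOf("data:") === 0) {
      continue;
    }
    // Relative URIs resolve against the model file, not the page.
    const resolved = new URL(uri, modelUrl);
    if (resolved.origin !== pageOrigin) {
      result.push(resolved.href);
    }
  }
  return result;
}
```

A single-file model, as Rik prefers, would produce an empty list here, while the model file's own URL would still be subject to whatever rule is chosen for the top-level fetch.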
https://github.com/immersive-web/model-element/issues/13
Unknown speaker: <model> looks a lot like a media element.
… It's just (a lot) more than 1-dimension (e.g., audio)
… Do all media elements need "controls"
… This is from Marcos
<ada> Leonard: this is marcosc
Brandon: Noted that media elements have many controls, spec language, and related APIs in common.
… glTF has multiple animations. How would that work?
Marcos: Doesn't know
[Note] Marcos needs to leave WG. Discussion might be
????: Media elements supports multiple tracks, but not necessarily all playable at the same time
Ada: Points out that the text track (caption) can play with audio & video
Brandel: Looking at must-haves and not-haves.
… Single animation track seems to be important
ada: image tracking unconference
… I'm a fan of image tracking
… the last time we talked about it, the consensus is that it's interesting
… but different hardware platforms have different solutions
… and they don't overlap
… arcore does images well but not QR codes
… likewise hololens is good at tracking QR code but can't track plain images
… so the consensus was, if we can't ship an API across devices, should we do it at all?
… the more I was thinking, in the case of HW, the use cases are different
… the hololens is tailored towards industry so QR codes make sense
… while arcore is more consumer focused
… I think they tend to support different audiences
… so it's probably not a big deal that they're different
… as a developer advocate, dom content and image tracking were the most important
… one of the things that's hard to do is shared anchors
… and the industry doesn't have a shared API
… but with qr code and image tracking, 2 users could localize to the same space
bajones: one of the things that makes this difficult is that arkit requires an image processing step upfront
… I can't find any runtime API
… arcore (??? something less complicated)
… I am concerned that image tracking requires an offline process
… if we want to have image tracking, we might have to use our own algorithm
… it's a concern whether we can deliver images that can be consumed
… arkit wants non-repeating nicely defined images
ada: do we want a pile of floats shared across the platform?
bajones: it would be a path
klausw: so yes, arcore lets you upload images at runtime
… it doesn't work with animations
… there's a subset of images that could work
… I wasn't aware of the details of arkit
<ada> to clarify you couldn't animate every magic the gathering card you are limited to 5ish images
klausw: but ada made a good point that the use case doesn't overlap
… another thing that came up is that we're providing raw camera access
… so that could be an avenue
… it's a weird API if it has unpredictable results
ada: raw camera access might give us a solution here
… for instance three.js might just build it in
… users shouldn't have to give up the farm for a basic feature
bialpio: the common use case from Nick was to detect images on curved surfaces
… this is not something we want to standardize
… so raw camera access might be needed
… the point is that it would be awesome to have image tracking across platforms
… even with that being available, that might not be enough
… should we extend the API to account for these use cases?
… so the simple api does something basic but more advanced cases use raw camera access
klausw: if someone goes far enough to set up a physical object
… raw camera access isn't a big barrier
ada: I understand where Nick's example comes from
… but we don't want webxr to always ask for camera access so people just always give it out
… it's good that people stay cautious
klausw: we do have an implementation in chrome of the draft spec
… and it's ready to go if this is what people want
… are people ok with making this a standard?
… or should it be completely different
ada: I'd love to go forward with it
… as bajones said, people may encounter problems based on the limitations of ARKit
yonet: do we need another contact for marker tracking
ada: does anyone else want to be a contact?
(Rik Cabanier) volunteers
<yonet> WebRTC meeting zoom information is here: https://