Summary: - 'POSSIBLE FOR TWO CPUs TO HOLD A CRITICAL SECTION' was resolved Signed-off-by: Masayuki Ishikawa <Masayuki.Ishikawa@jp.sony.com>
NuttX TODO List (Last updated November 20, 2020)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This file summarizes known NuttX bugs, limitations, inconsistencies with
standards, things that could be improved, and ideas for enhancements. This
TODO list does not include issues associated with individual board ports. See
also the individual README.txt files in the boards/ sub-directories for
issues related to each board port.

nuttx/:

 (16) Task/Scheduler (sched/)
  (2) SMP
  (1) Memory Management (mm/)
  (0) Power Management (drivers/pm)
  (5) Signals (sched/signal, arch/)
  (2) pthreads (sched/pthread, libs/libc/pthread)
  (0) Message Queues (sched/mqueue)
  (1) Work Queues (sched/wqueue)
  (6) Kernel/Protected Build
  (2) C++ Support
  (5) Binary loaders (binfmt/)
 (17) Network (net/, drivers/net)
  (4) USB (drivers/usbdev, drivers/usbhost)
  (2) Other drivers (drivers/)
  (9) Libraries (libs/libc/, libs/libm/)
 (12) File system/Generic drivers (fs/, drivers/)
 (10) Graphics Subsystem (graphics/)
  (1) Build system / Toolchains
  (2) Linux/Cygwin simulation (arch/sim)
  (5) ARM (arch/arm/)

apps/ and other Add-Ons:

  (1) Network Utilities (apps/netutils/)
  (1) NuttShell (NSH) (apps/nshlib)
  (2) System libraries apps/system (apps/system)
  (1) Modbus (apps/modbus)
  (5) Other Applications & Tests (apps/examples/)

o Task/Scheduler (sched/)
  ^^^^^^^^^^^^^^^^^^^^^^^

  Title:       CHILD PTHREAD TERMINATION
  Description: When a task exits, shouldn't all of its child pthreads also be
               terminated?

               This behavior was implemented as an option controlled by the
               configuration setting CONFIG_SCHED_EXIT_KILL_CHILDREN. This
               option must be used with caution, however. It should not be
               used unless you are certain of what you are doing. Uninformed
               use of this option can often lead to memory leaks since, for
               example, memory allocations held by threads are not
               automatically freed!

  Status:      Closed. No, this behavior will not be implemented unless
               specifically selected.
  Priority:    Medium, required for good emulation of process/pthread model.
               The current behavior allows for the main thread of a task to
               exit() and any child pthreads will persist. That does raise
               some issues: The main thread is treated much like
               just-another-pthread but must follow the semantics of a task
               or a process. That results in some inconsistencies (for
               example, with robust mutexes, what should happen if the main
               thread exits while holding a mutex?)

  Title:       pause() NON-COMPLIANCE
  Description: In the POSIX description of this function the pause() function
               must suspend the calling thread until delivery of a signal whose
               action is either to execute a signal-catching function or to
               terminate the process. The current implementation only waits for
               any non-blocked signal to be received. It should only wake up if
               the signal is delivered to a handler.
  Status:      Open.
  Priority:    Medium Low.

  Title:       ON-DEMAND PAGING INCOMPLETE
  Description: On-demand paging has recently been incorporated into the RTOS.
               The design of this feature is described here:
               https://nuttx.apache.org/docs/latest/components/paging.html.
               As of this writing, the basic feature implementation is
               complete and much of the logic has been verified. The test
               harness for the feature exists only for the NXP LPC3131 (see
               boards/arm/lpc31xx/ea3131/configs/pgnsh and locked
               directories). There are some limitations of this testing so
               I still cannot say that the feature is fully functional.
  Status:      Open. This has been put on the shelf for some time.
  Priority:    Medium-Low

  Title:       GET_ENVIRON_PTR()
  Description: get_environ_ptr() (sched/sched_getenvironptr.c) is not implemented.
               The representation of the environment strings selected for
               NuttX is not compatible with the operation. Some significant
               re-design would be required to implement this function and that
               effort is thought to be not worth the result.
  Status:      Open. No change is planned.
  Priority:    Low -- There is no plan to implement this.

  Title:       TIMER_GETOVERRUN()
  Description: timer_getoverrun() (sched/timer_getoverrun.c) is not implemented.
  Status:      Open
  Priority:    Low -- There is no plan to implement this.

  Title:       INCOMPATIBILITIES WITH execv() AND execl()
  Description: Simplified 'execl()' and 'execv()' functions are provided by
               NuttX. NuttX does not support processes and hence the concept
               of overlaying a task's process image with a new process image
               does not make any sense. In NuttX, these functions are
               wrapper functions that:

                 1. Call the non-standard binfmt function 'exec', and then
                 2. exit(0).

               As a result, the current implementations of 'execl()' and
               'execv()' suffer from some incompatibilities. The most
               serious of these is that the exec'ed task will not have
               the same task ID as the vfork'ed function. So the parent
               function cannot know the ID of the exec'ed task.
  Status:      Open
  Priority:    Medium Low for now

  Title:       ISSUES WITH atexit(), on_exit(), AND pthread_cleanup_pop()
  Description: These functions execute with the following bad properties:

               1. They run with interrupts disabled,
               2. They run in supervisor mode (if applicable), and
               3. They do not obey any setup of PIC or address
                  environments. Do they need to?
               4. In the case of task_delete() and pthread_cancel() without
                  deferred cancellation, these callbacks will run on the
                  thread of execution and address context of the caller of
                  task_delete() or pthread_cancel(). That is very bad!

               The fix for all of these issues is to have the callbacks
               run on the caller's thread as is currently done with
               signal handlers. Signals are delivered differently in
               PROTECTED and KERNEL modes: The delivery involves a
               signal handling trampoline function in the user address
               space and two signal handlers: One to call the signal
               handler trampoline in user mode (SYS_signal_handler) and
               one within the signal handler trampoline to return to
               supervisor mode (SYS_signal_handler_return).

               The primary difference is in the location of the signal
               handling trampoline:

               - In PROTECTED mode, there is only a single user space blob
                 with a header at the beginning of the block (at a
                 well-known location). There is a pointer to the signal
                 handler trampoline function in that header.
               - In the KERNEL mode, a special process signal handler
                 trampoline is used at a well-known location in every
                 process address space (ARCH_DATA_RESERVE->ar_sigtramp).
  Status:      Open
  Priority:    Medium Low. This is an important change to some less
               important interfaces. For the average user, these
               functions are just fine the way they are.

  Title:       execv() AND vfork()
  Description: There is a problem when vfork() calls execv() (or execl()) to
               start a new application: When the parent thread calls vfork()
               it receives the pid of the vforked task, and *not*
               the pid of the desired execv'ed application.

               The same tasking arrangement is used by the standard function
               posix_spawn(). However, posix_spawn uses the non-standard, internal
               NuttX interface task_reparent() to replace the child's parent task
               with the caller of posix_spawn(). That cannot be done with vfork()
               because we don't know what vfork() is going to do.

               Any solution to this is either very difficult or impossible without
               an MMU.
  Status:      Open
  Priority:    Low (it might as well be low since it isn't going to be fixed).

  Title:       errno IS NOT SHARED AMONG THREADS
  Description: In NuttX, the errno value is unique for each thread. But for
               bug-for-bug compatibility, the same errno should be shared by
               the task and each thread that it creates. It is *very* easy
               to make this change: Just move the tls_errno field from
               struct tls_info_s to struct task_group_s. However, I am still
               not sure if this should be done or not.
               NOTE: glibc behaves this way unless __thread is defined; in
               that case, it behaves like NuttX (using TLS to save the
               thread local errno).
  Status:      Closed. The existing solution is better and compatible with
               thread-aware GLIBC (although its incompatibilities could show
               up in porting some code). I will retain this issue for
               reference only.
  Priority:    N/A

  Title:       SCALABILITY
  Description: Task control information is retained in simple lists. This
               is completely appropriate for small embedded systems where
               the number of tasks, N, is relatively small. Most list
               operations are O(N). This could become an issue if N gets
               very large.

               In that case, these simple lists should be replaced with
               something more performant such as a balanced tree in the
               case of ordered lists. Fortunately, most internal lists are
               hidden behind simple accessor functions and so the internal
               data structures can be changed if needed with very little
               impact.

               Explicit references to the list structure are hidden behind
               the macro this_task().

  Status:      Open
  Priority:    Low. Things are just the way that we want them for the way
               that NuttX is used today.

  Title:       INTERNAL VERSIONS OF USER FUNCTIONS
  Description: The internal NuttX logic uses the same interfaces as does
               the application. That sometimes produces a problem because
               there is "overloaded" functionality in those user interfaces
               that is not desirable.

               For example, having cancellation points hidden inside of the
               OS can cause non-cancellation point interfaces to behave
               strangely.

               Here is another issue: Internal OS functions should not set
               errno and should never have to look at the errno value to
               determine the cause of the failure. The errno is provided
               for compatibility with POSIX application interface
               requirements and really doesn't need to be used within the
               OS.

               Both of these could be fixed if there were special internal
               versions of these functions. For example, there could be an
               nxsem_wait() that does all of the same things as sem_wait()
               but does not create a cancellation point and does not set
               the errno value on failures.

               Everything inside the OS would use nxsem_wait().
               Applications would call sem_wait() which would just be a
               wrapper around nxsem_wait() that adds the cancellation point
               and that sets the errno value on failures.

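               The split described above can be sketched as follows. This is
               only an illustration that runs on a host, not the NuttX
               implementation: toy_sem_t, toy_nxsem_wait(), and
               toy_sem_wait() are hypothetical names, the toy semaphore
               never blocks (it returns -EAGAIN instead), and the
               cancellation point is only marked by comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy semaphore: a bare count, no blocking, no waiters */

typedef struct
{
  int count;
} toy_sem_t;

/* Internal (OS) version: reports a failure as a negated errno value and
 * never touches errno.  A real implementation would block; this sketch
 * returns -EAGAIN instead so that it can run anywhere.
 */

int toy_nxsem_wait(toy_sem_t *sem)
{
  if (sem == NULL)
    {
      return -EINVAL;
    }

  if (sem->count <= 0)
    {
      return -EAGAIN;
    }

  sem->count--;
  return 0;
}

/* Application version: a thin wrapper that follows the POSIX convention
 * of returning -1 and setting errno.  This is also where a cancellation
 * point would be entered and left.
 */

int toy_sem_wait(toy_sem_t *sem)
{
  int ret;

  /* (a real wrapper would enter the cancellation point here) */

  ret = toy_nxsem_wait(sem);
  assert(ret <= 0);
  if (ret < 0)
    {
      errno = -ret;
      ret = -1;
    }

  /* (a real wrapper would leave the cancellation point here) */

  return ret;
}
```

               With this arrangement, OS code can test the return value of
               the internal function directly and never has to read or
               restore errno.
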
               One particularly difficult issue is the use of common memory
               manager, C, and NX libraries in the build. For the PROTECTED
               and KERNEL builds, this issue is resolved. In that case,
               the OS links with a different version of the libraries than
               does the application: The OS version would use the OS internal
               interfaces and the application would use the standard
               interfaces.

               But for the FLAT build, both the OS and the applications use
               the same library functions. For applications, the library
               functions *must* support errno's and cancellation and, hence,
               these are also used within the OS.

               But that raises yet another issue: If the application
               version of the libraries uses the standard interfaces
               internally, then it may generate unexpected cancellation
               points. For example, the memory management would take a
               semaphore using sem_wait() to get exclusive access to the
               heap. That means that every call to malloc() and free()
               would be a cancellation point, a clear POSIX violation.

               Changes like that could clean up some of this internal
               craziness.

               UPDATE:
                 2017-10-03: This change has been completed for the case of
                   semaphores used in the OS. Still need to check out signals
                   and message queues that are also used in the OS. Also
                   backed out commit b4747286b19d3b15193b2a5e8a0fe48fa0a8638c.
                 2017-10-06: This change has been completed for the case of
                   signals used in the OS. Still need to check out message
                   queues that are also used in the OS.
                 2017-10-10: This change has been completed for the case of
                   message queues used in the OS. I am keeping this issue
                   open because (1) there are some known remaining calls
                   that will modify the errno (such as dup(), dup2(),
                   nxtask_activate(), kthread_create(), exec(), mq_open(),
                   mq_close(), and others) and (2) there may still be calls that
                   create cancellation points. Need to check things like open(),
                   close(), read(), write(), and possibly others.
                 2018-01-30: This change has been completed for the case of
                   scheduler functions used within the OS: sched_getparam(),
                   sched_setparam(), sched_getscheduler(), sched_setscheduler(),
                   and sched_setaffinity().
                 2018-09-15: This change has been completed for the case of
                   open() used within the OS. There are places under libs/ and
                   boards/ that have not been converted. I also note cases
                   where fopen() is called under libs/libc/netdb/.
                 2019-09-11: builtin_isavail() no longer sets the errno variable.

  Status:      Open
  Priority:    Low. Things are working OK the way they are. But the design
               could be improved and made a little more efficient with this
               change.

  Title:       IDLE THREAD TCB SETUP
  Description: There are issues with setting IDLE thread stacks:

               The problem is colorizing that stack to use with stack usage
               monitoring logic. There is logic in some start functions to
               do this in a function called go_nx_start.
               It is available in these architectures:

               ./arm/src/efm32/efm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/kinetis/kinetis_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/sam34/sam_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/samv7/sam_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/stm32/stm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/stm32f7/stm32_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/stm32l4/stm32l4_start.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/tms570/tms570_boot.c:static void go_nx_start(void *pv, unsigned int nbytes)
               ./arm/src/xmc4/xmc4_start.c:static void go_nx_start(void *pv, unsigned int nbytes)

               But no others.
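
               For readers unfamiliar with the technique, stack colorization
               can be sketched as below. This is a host-runnable
               illustration, not the code in the files listed above: the
               names color_stack() and stack_used() and the fill value are
               assumptions, and the sketch assumes a stack that grows
               downward from the high end of the region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_COLOR 0xdeadbeefu  /* Assumed fill pattern for this sketch */

/* Fill a word-aligned stack region with the color pattern */

void color_stack(void *pv, unsigned int nbytes)
{
  uint32_t *ptr = (uint32_t *)pv;
  unsigned int nwords = nbytes / sizeof(uint32_t);
  unsigned int i;

  assert(pv != NULL && ((uintptr_t)pv % sizeof(uint32_t)) == 0);

  for (i = 0; i < nwords; i++)
    {
      ptr[i] = STACK_COLOR;
    }
}

/* Estimate the number of bytes used.  The stack is assumed to grow down
 * from the high end of the region, so scan up from the low end and count
 * the words that still hold the color.
 */

unsigned int stack_used(const void *pv, unsigned int nbytes)
{
  const uint32_t *ptr = (const uint32_t *)pv;
  unsigned int nwords = nbytes / sizeof(uint32_t);
  unsigned int untouched = 0;

  while (untouched < nwords && ptr[untouched] == STACK_COLOR)
    {
      untouched++;
    }

  return (nwords - untouched) * (unsigned int)sizeof(uint32_t);
}
```

               The estimate is a high-water mark: a word that happens to be
               written with the color value itself would be miscounted.
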
  Status:      Open
  Priority:    Low, only needed for more complete debug.

  Title:       PRIORITY INHERITANCE WITH SPORADIC SCHEDULER
  Description: The sporadic scheduler manages CPU utilization by a task by
               alternating between a high and a low priority. In either
               state, it may have its priority boosted. However, under
               some circumstances, it is impossible in the current design to
               switch to the correct priority if a semaphore held by the
               sporadic thread is participating in priority inheritance:

               There is an issue when switching from the high to the low
               priority state. If the priority was NOT boosted above the
               higher priority, it may still need to be boosted with
               respect to the lower priority. If the highest priority
               thread waiting on a semaphore held by the sporadic thread is
               higher in priority than the low priority but less than the
               higher priority, then the new thread priority should be set
               to that middle priority, not to the lower priority.

               In order to do this we would need to know the highest
               priority from among all tasks waiting for all of the
               semaphores held by the sporadic task. That information could
               be retained by the priority inheritance logic for use by the
               sporadic scheduler. The boost priority could be retained in
               a new field of the TCB (say, pend_priority). That
               pend_priority could then be used when switching from the
               higher to the lower priority.
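
               A minimal sketch of that proposal, assuming that a larger
               number means a higher priority and that pend_priority is 0
               when no thread is waiting. The structure below is a
               hypothetical, reduced stand-in for the TCB, and the function
               name is an assumption, not existing NuttX code.

```c
#include <assert.h>

/* Hypothetical, reduced view of the TCB fields involved */

struct sporadic_tcb_s
{
  int hi_priority;    /* Sporadic high priority */
  int low_priority;   /* Sporadic low priority */
  int pend_priority;  /* Highest priority among waiters; 0 if none */
  int sched_priority; /* Priority currently in effect */
};

/* Switch to the low priority state without dropping below the priority
 * of the highest waiter on any semaphore held by this thread.
 */

void sporadic_set_lowpriority(struct sporadic_tcb_s *tcb)
{
  int prio;

  assert(tcb->low_priority <= tcb->hi_priority);

  prio = tcb->low_priority;
  if (tcb->pend_priority > prio)
    {
      prio = tcb->pend_priority;
    }

  tcb->sched_priority = prio;
}
```

               With a high priority of 200, a low priority of 50, and a
               waiter at priority 120, the thread would drop to 120 rather
               than to 50.
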
  Status:      Open
  Priority:    Low. Does anyone actually use the sporadic scheduler?

  Title:       SIMPLIFY SPORADIC SCHEDULER DESIGN
  Description: I have been planning to re-implement sporadic scheduling for
               some time. I believe that the current implementation is
               unnecessarily complex. There is no clear statement for the
               requirements of sporadic scheduling that I could find, so I
               based the design on some behaviors of another OS that I saw
               published (QNX as I recall).

               But I think that the bottom line requirement for sporadic
               scheduling is that it should make a best attempt to
               control a fixed percentage of CPU bandwidth for a task
               during an interval only by modifying its priority between
               a low and a high priority. The current design involves
               several timers: A "budget" timer plus a variable number of
               "replenishment" timers and a lot of nonsense to duplicate QNX
               behavior that I think is not necessary.

               I think that the sporadic scheduler could be re-implemented
               with only the single "budget" timer. Instead of starting a
               new "replenishment" timer when the task is resumed, that
               single timer could just be extended.
  Status:      Open
  Priority:    Low. This is an enhancement. And does anyone actually use
               the sporadic scheduler?

  Title:       REMOVE NESTED CANCELLATION POINT SUPPORT
  Description: The current implementation supports nested cancellation points.
               The TCB field cpcount keeps track of that nesting level.
               However, cancellation points should not be calling other
               cancellation points so this design could be simplified by
               removing all support for nested cancellation points.
  Status:      Open
  Priority:    Low. No harm is being done by the current implementation.
               This change is primarily for aesthetic reasons. It would
               reduce memory usage by a very small but probably
               insignificant amount.

  Title:       DAEMONIZE ELF PROGRAM
  Description: It is a common practice to "daemonize" to detach a task from
               its parent. This is used with NSH, for example, so that NSH
               will not stall, waiting in waitpid() for the child task to
               exit.

               Daemonization is done by creating a new task which continues
               to run while the original task exits (sending the SIGCHLD
               signal to the parent and awakening waitpid()). In a pure
               POSIX system, this is done with fork(), perhaps like:

                 if (fork() != 0)
                   {
                     exit(0);
                   }

               but is usually done with task_create() in NuttX. But when
               task_create() is called from within an ELF program, a very
               perverse situation is created:

               The basic problem involves address environments and task groups:
               "Task groups" are emulations of Linux processes. For the
               case of the FLAT, ELF module, the address environment is
               allocated memory that contains the ELF module.

               When you call task_create() from the ELF program, you now
               have two task groups running in the same address environment.
               That is a perverse situation for which there is no standard
               solution. There is nothing comparable to that. Even in
               Linux, fork() creates another address environment (although
               it is an exact copy of the original).

               When the ELF program was created, the function exec() in
               binfmt/binfmt_exec.c runs. It sets up a callback that will
               be invoked when the ELF program exits.

               When the ELF program exits, the address environment is
               destroyed and the other task running in the same address
               environment is then running in stale memory and will
               eventually crash.

               Nothing special happens when the other created task running
               in the allocated address environment exits since it has no
               such callbacks.

               In order to make this work you would need logic like:

               1. When the ELF task calls task_create(), it would need to:

                  a. Detect that task_create() was called from an ELF program,
                  b. Increment a reference count on the address environment, and
                  c. Set up the same exit hook for the newly created task.

               2. Then when either the ELF program task or the created task
                  in the same address environment exits, it would decrement
                  the reference count. When the last task exits, the reference
                  count would go to zero and the address environment could be
                  destroyed.

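               The reference counting in steps 1 and 2 might look roughly
               like the sketch below. Everything here is hypothetical: the
               structure and the toy_addrenv_*() names are stand-ins, plain
               malloc() memory plays the role of the allocated address
               environment, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the memory that holds a loaded ELF module */

struct toy_addrenv_s
{
  void *alloc;  /* Memory holding the module */
  int   crefs;  /* Number of tasks running in this environment */
};

/* Called when the ELF program itself is started */

struct toy_addrenv_s *toy_addrenv_create(size_t size)
{
  struct toy_addrenv_s *env = malloc(sizeof(*env));

  if (env == NULL)
    {
      return NULL;
    }

  env->alloc = malloc(size);
  if (env->alloc == NULL)
    {
      free(env);
      return NULL;
    }

  env->crefs = 1;
  return env;
}

/* Step 1: A task created from inside the ELF program takes a reference */

void toy_addrenv_attach(struct toy_addrenv_s *env)
{
  assert(env != NULL && env->crefs > 0);
  env->crefs++;
}

/* Step 2: Exit hook shared by every task in the environment.  Returns
 * true if this call released the last reference and freed the memory.
 */

bool toy_addrenv_detach(struct toy_addrenv_s *env)
{
  assert(env != NULL && env->crefs > 0);

  if (--env->crefs > 0)
    {
      return false;
    }

  free(env->alloc);
  free(env);
  return true;
}
```

               The order in which the two tasks exit then no longer matters:
               whichever exits last releases the memory.
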
               This is complex work and would take some effort and probably
               requires redesign of existing code and interfaces to get a
               proper, clean, modular solution.

  Status:      Open
  Priority:    Medium-Low. A simple work-around when using NSH is to use
               the '&' postfix to put the started ELF program into the
               background.

o SMP
  ^^^

  Title:       MISUSE OF sched_lock() IN SMP MODE
  Description: The OS API sched_lock() disables pre-emption and locks a
               task in place. In the single CPU case, it is also often
               used to enforce a simple critical section since no other
               task can run while pre-emption is locked.

               This, however, does not generalize to the SMP case. In the
               SMP case, there are multiple tasks running on multiple CPUs.
               The basic behavior is still correct: The task that has
               locked pre-emption will not be suspended. However, there
               is no longer any protection for use as a critical section:
               tasks running on other CPUs may still execute that
               unprotected code region.

               The solution is to replace the use of sched_lock() with
               stronger protection such as spin_lock_irqsave().
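
               To illustrate why a lock shared between CPUs is needed, here
               is a host-runnable sketch of a spinlock protecting a shared
               counter. It is not the NuttX spin_lock_irqsave()
               implementation: the toy_spin_*() names are hypothetical, the
               lock is built on a C11 atomic_flag, and the interrupt
               disabling that spin_lock_irqsave() implies by its name is
               not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock, built on a C11 atomic_flag */

typedef struct
{
  atomic_flag locked;
} toy_spinlock_t;

static toy_spinlock_t g_lock = { ATOMIC_FLAG_INIT };
static unsigned int g_counter;  /* Shared state needing a critical section */

void toy_spin_lock(toy_spinlock_t *lock)
{
  assert(lock != NULL);

  while (atomic_flag_test_and_set_explicit(&lock->locked,
                                           memory_order_acquire))
    {
      /* Spin: some other CPU holds the lock */
    }
}

void toy_spin_unlock(toy_spinlock_t *lock)
{
  assert(lock != NULL);
  atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/* Unlike disabling pre-emption, the lock also excludes code running at
 * the same time on other CPUs.
 */

unsigned int counter_increment(void)
{
  unsigned int value;

  toy_spin_lock(&g_lock);
  value = ++g_counter;
  toy_spin_unlock(&g_lock);

  return value;
}
```

               Disabling pre-emption only stops the current CPU from
               switching tasks. The atomic test-and-set is what keeps a
               second CPU out of the region.
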
  Status:      Open
  Priority:    Medium for SMP systems. Not critical to single CPU systems.
               NOTE: There are no known bugs from this potential problem.

  Title:       ISSUES WITH ACCESSING CPU INDEX
  Description: The CPU number is accessed usually with the macro this_cpu().
               The returned CPU number is then used for various things,
               typically as an array index. However, if pre-emption is
               not disabled, then it is possible that a context switch
               could occur and that logic could run on another CPU with
               possibly fatal consequences.

               We need to evaluate all use of this_cpu() and assure that
               it is used in a way that guarantees that the code continues
               to execute on the same CPU.

  Status:      Open
  Priority:    Medium. This is a logical problem but I have never seen
               any bugs caused by this. But I believe that failures are
               possible.

o Memory Management (mm/)
  ^^^^^^^^^^^^^^^^^^^^^^^

  Title:       FREE MEMORY ON TASK EXIT
  Description: Add an option to free all memory allocated by a task when the
               task exits. This is probably not worth the overhead for a
               deeply embedded system.

               There would be complexities with this implementation as well
               because often one task allocates memory and then passes the
               memory to another: The task that "owns" the memory may not
               be the same as the task that allocated the memory.

               Update. From the NuttX forum:
                 ...there is a good reason why task A should never delete task B.
                 That is because you will strand memory resources. Another feature
                 lacking in most flat address space RTOSs is automatic memory
                 clean-up when a task exits.

                 That behavior just comes for free in a process-based OS like Linux:
                 Each process has its own heap and when you tear down the process
                 environment, you naturally destroy the heap too.

                 But RTOSs have only a single, shared heap. I have spent some time
                 thinking about how you could clean up memory required by a task
                 when a task exits. It is not so simple. It is not as simple as
                 just keeping memory allocated by a thread in a list then freeing
                 the list of allocations when the task exits.

                 It is not that simple because you don't know how the memory is
                 being used. For example, if task A allocates memory that is used
                 by task B, then when task A exits, you would not want to free that
                 memory needed by task B. In a process-based system, you would
                 have to explicitly map shared memory (with reference counting) in
                 order to share memory. So the life of shared memory in that
                 environment is easily managed.

                 I have thought that the way that this could be solved in NuttX
                 would be: (1) add links and reference counts to all memory allocated
                 by a thread. This would increase the memory allocation overhead!
                 (2) Keep the list head in the TCB, and (3) extend mmap() and munmap()
                 to include the shared memory operations (which would only manage
                 the reference counting and the life of the allocation).

                 Then what about pthreads? Memory should not be freed until the last
                 pthread in the group exits. That could be done with an additional
                 reference count on the whole allocated memory list (just as streams
                 and file descriptors are now shared and persist until the last
                 pthread exits).

                 I think that would work but to me is very unattractive and
                 inconsistent with the NuttX "small footprint" objective. ...

               Other issues:
               - Memory free time would go up because you would have to remove
                 the memory from that list in free().
               - There are special cases inside the RTOS itself. For example,
                 if task A creates task B, then initial memory allocations for
                 task B are created by task A. Some special allocators would
                 be required to keep this memory on the correct list (or on
                 no list at all).

               Updated 2016-06-25:
               For processors with an MMU (Memory Management Unit), NuttX can be
               built in a kernel mode. In that case, each process will have a
               local copy of its heap (filled with sbrk()) and when the process
               exits, its local heap will be destroyed and the underlying page
               memory is recovered.

               So in this case, NuttX works just like Linux or other *nix systems:
               All memory allocated by processes or threads in processes will
               be recovered when the process exits.

               But not for the flat memory build. In that case, the issues
               above do apply. There is no safe way to recover the memory in
               that case (and even if there were, the additional overhead would
               not be acceptable on most platforms).

               This does not prohibit anyone from creating a wrapper for malloc()
               and an atexit() callback that frees memory on task exit. People
               are free and, in fact, encouraged, to do that. However, since
               it is inherently unsafe, I would never incorporate anything
               like that into NuttX.

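               A wrapper of the kind mentioned in the last paragraph could
               be sketched as follows. All of the track_*() names are
               hypothetical. The sketch keeps a single, unlocked list, and
               it is unsafe in exactly the way described above: memory that
               has been handed to another task is freed anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each allocation carries a hidden header that links it into one list.
 * The header is two pointers long, so the payload that follows it is
 * aligned to twice the pointer size.
 */

struct track_hdr_s
{
  struct track_hdr_s *flink;
  struct track_hdr_s *blink;
};

static struct track_hdr_s *g_track_head;

void *track_malloc(size_t size)
{
  struct track_hdr_s *hdr = malloc(sizeof(*hdr) + size);

  if (hdr == NULL)
    {
      return NULL;
    }

  hdr->blink = NULL;
  hdr->flink = g_track_head;
  if (g_track_head != NULL)
    {
      g_track_head->blink = hdr;
    }

  g_track_head = hdr;
  return hdr + 1;
}

void track_free(void *mem)
{
  struct track_hdr_s *hdr;

  if (mem == NULL)
    {
      return;
    }

  hdr = (struct track_hdr_s *)mem - 1;
  if (hdr->blink != NULL)
    {
      hdr->blink->flink = hdr->flink;
    }
  else
    {
      assert(g_track_head == hdr);
      g_track_head = hdr->flink;
    }

  if (hdr->flink != NULL)
    {
      hdr->flink->blink = hdr->blink;
    }

  free(hdr);
}

/* Free everything still on the list; returns the number of blocks freed */

unsigned int track_free_all(void)
{
  unsigned int nfreed = 0;

  while (g_track_head != NULL)
    {
      track_free(g_track_head + 1);
      nfreed++;
    }

  return nfreed;
}

static void track_exit_hook(void)
{
  (void)track_free_all();
}

/* Arrange for the clean-up to run at exit; returns 0 on success */

int track_register_exit_hook(void)
{
  return atexit(track_exit_hook);
}
```

               The extra list removal in track_free() is the increase in
               free time listed under "Other issues" above.
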
  Status:      Open. No changes are planned. NOTE: This applies to the FLAT
               and PROTECTED builds only. There is no such leaking of memory
               in the KERNEL build mode.
  Priority:    Medium/Low, a good feature to prevent memory leaks but would
               have negative impact on memory usage and code size.

o Power Management (drivers/pm)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
o Signals (sched/signal, arch/)
  ^^^^^^^^^^^^^^^^^^^^^^^

  Title:       STANDARD SIGNALS
  Description: 'Standard' signals and signal actions are not fully
               supported. The SIGCHLD signal is supported and, if the
               option CONFIG_SIG_DEFAULT=y is included, some signals will
               perform their default actions (dependent upon additional
               configuration settings):

                 Signal  Action               Additional Configuration
                 ------- -------------------- -------------------------
                 SIGUSR1 Abnormal Termination CONFIG_SIG_SIGUSR1_ACTION
                 SIGUSR2 Abnormal Termination CONFIG_SIG_SIGUSR2_ACTION
                 SIGALRM Abnormal Termination CONFIG_SIG_SIGALRM_ACTION
                 SIGPOLL Abnormal Termination CONFIG_SIG_SIGPOLL_ACTION
                 SIGSTOP Suspend task         CONFIG_SIG_SIGSTOP_ACTION
                 SIGTSTP Suspend task         CONFIG_SIG_SIGSTOP_ACTION
                 SIGCONT Resume task          CONFIG_SIG_SIGSTOP_ACTION
                 SIGINT  Abnormal Termination CONFIG_SIG_SIGKILL_ACTION
                 SIGKILL Abnormal Termination CONFIG_SIG_SIGKILL_ACTION

  Status:      Open. No further changes are planned.
  Priority:    Low, required by standards but not so critical for an
               embedded system.

  Title:       SIGEV_THREAD
  Description: Implementation of support for SIGEV_THREAD is available
               only in the FLAT build mode because it uses the OS work queues to
               perform the callback. The alternative for the PROTECTED and KERNEL
               builds would be to create pthreads in the user space to perform the
               callbacks. That is not a very attractive solution due to performance
               issues. It would also require some additional logic to specify the
               TCB of the parent so that the pthread could be bound to the correct
               group.

               There is also some user-space logic in libs/libc/aio/lio_listio.c.
               That logic could use the user-space work queue for the callbacks.
  Status:      Low, there are alternative designs. However, these features
               are required by the POSIX standard.
  Priority:    Low for now

  Title:       SIGNAL NUMBERING
  Description: In signal.h, the range of valid signals is listed as 0-31. However,
               in many interfaces, 0 is not a valid signal number. The valid
               signal numbers should be 1-32. The signal set operations would need
               to map bits appropriately.
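
               The bit mapping could be as simple as the following sketch.
               The toy_* names are hypothetical and a 32-bit set is assumed.
               Signal number N occupies bit N-1, so that 0 is rejected and
               32 is accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MIN_SIGNO 1
#define TOY_MAX_SIGNO 32

typedef uint32_t toy_sigset_t;

#define TOY_GOOD_SIGNO(s) ((s) >= TOY_MIN_SIGNO && (s) <= TOY_MAX_SIGNO)
#define TOY_SIGNO2SET(s)  ((toy_sigset_t)1 << ((s) - TOY_MIN_SIGNO))

/* Add a signal to the set; returns 0 on success, -1 for a bad number */

int toy_sigaddset(toy_sigset_t *set, int signo)
{
  assert(set != NULL);

  if (!TOY_GOOD_SIGNO(signo))
    {
      return -1;
    }

  *set |= TOY_SIGNO2SET(signo);
  return 0;
}

/* Returns 1 if the signal is a member, 0 if not, -1 for a bad number */

int toy_sigismember(const toy_sigset_t *set, int signo)
{
  assert(set != NULL);

  if (!TOY_GOOD_SIGNO(signo))
    {
      return -1;
    }

  return (*set & TOY_SIGNO2SET(signo)) != 0;
}
```
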
  Status:      Open
  Priority:    Low. Even if there are only 31 usable signals, that is still a lot.

  Title:       NO QUEUING of SIGNAL ACTIONS
  Description: In the architecture specific implementation of struct xcptcontext,
               there are fields used by signal handling logic to pass the state
               information needed to dispatch signal actions to the appropriate
               handler.

               There is only one copy of this state information in the
               implementations of struct xcptcontext and, as a consequence,
               if there is a signal handler executing on a thread, then additional
               signal actions will be lost until that signal handler completes
               and releases those resources.
  Status:      Open
  Priority:    Low. This design flaw has been around for ages and no one has yet
               complained about it. Apparently the visibility of the problem is
               very low.

  Title:       QUEUED SIGNAL ACTIONS ARE INAPPROPRIATELY DEFERRED
  Description: The implementation of nxsig_deliver() does the following in a loop:
               - It takes the next queued signal action from a list
               - Calls the architecture-specific up_sigdeliver() to perform
                 the signal action (through some sleight of hand in
                 up_schedule_sigaction())
               - up_sigdeliver() is a trampoline function that performs the
                 actual signal action as well as some housekeeping functions
                 then
               - up_sigdeliver() performs a context switch back to the normal,
                 uninterrupted thread instead of returning to nxsig_deliver().

               The loop in nxsig_deliver() then will not have the opportunity
               to run again until that normal, uninterrupted thread is
               suspended. Then the loop will continue with the next queued
               signal action.

               Normally signals execute immediately. That is the whole reason
               why almost all blocking APIs return when a signal is received
               (with errno equal to EINTR).
  Status:      Open
  Priority:    Low. This design flaw has been around for ages and no one has yet
               complained about it. Apparently the visibility of the problem is
               very low.

o pthreads (sched/pthread, libs/libc/pthread)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Title: PTHREAD_PRIO_PROTECT
Description: Extend pthread_mutexattr_setprotocol(). It should support
PTHREAD_PRIO_PROTECT (and so should its non-standard counterpart
sem_setproto()).

"When a thread owns one or more mutexes initialized with the
PTHREAD_PRIO_PROTECT protocol, it shall execute at the higher of its
priority or the highest of the priority ceilings of all the mutexes
owned by this thread and initialized with this attribute, regardless of
whether other threads are blocked on any of these mutexes or not.

"While a thread is holding a mutex which has been initialized with
the PTHREAD_PRIO_INHERIT or PTHREAD_PRIO_PROTECT protocol attributes,
it shall not be subject to being moved to the tail of the scheduling queue
at its priority in the event that its original priority is changed,
such as by a call to sched_setparam(). Likewise, when a thread unlocks
a mutex that has been initialized with the PTHREAD_PRIO_INHERIT or
PTHREAD_PRIO_PROTECT protocol attributes, it shall not be subject to
being moved to the tail of the scheduling queue at its priority in the
event that its original priority is changed."

Status: Open. No changes planned.
Priority: Low -- about zero, probably not that useful. Priority inheritance is
already supported and is a much better solution. And it turns out
that priority protection is just about as complex as priority inheritance.
Excerpted from my post in a LinkedIn discussion:

"I started to implement this HLS/"PCP" semaphore in an RTOS that I
work with (https://apache.nuttx.org) and I discovered after doing the
analysis and basic code framework that a complete solution for the
case of a counting semaphore is still quite complex -- essentially
as complex as is priority inheritance.

"For example, suppose that a thread takes 3 different HLS semaphores
A, B, and C. Suppose that they are prioritized in that order with
A the lowest and C the highest. Suppose the thread takes 5 counts
from A, 3 counts from B, and 2 counts from C. What priority should
it run at? It would have to run at the priority of the highest
priority semaphore C. This means that the RTOS must maintain
internal information of the priority of every semaphore held by
the thread.

"Now suppose it releases one count on semaphore B. How does the
RTOS know that it still holds 2 counts on B? With some complex
internal data structure. The RTOS would have to maintain internal
information about how many counts from each semaphore are held
by each thread.

"How does the RTOS know that it should not decrement the priority
from the priority of C? Again, only with internal complexity. It
would have to know the priority of every semaphore held by
every thread.

"Providing the HLS capability on a simple pthread mutex would not
be quite such a complex job if you allow only one mutex per
thread. However, the more general case seems almost as complex
as priority inheritance. I decided that the implementation does
not have value to me. I only wanted it for its reduced
complexity; in all other ways I believe that it is the inferior
solution. So I discarded a few hours of programming. Not a
big loss from the experience I gained."

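The per-thread bookkeeping described in the excerpt above can be sketched
roughly as follows. This is only an illustration of the A/B/C example and
not NuttX code: the sketch_* types, the fixed-size table, and the
convention that a larger number means a higher priority are all
assumptions.

```c
#include <stddef.h>

#define SKETCH_MAX_HELD 8  /* Hypothetical limit on semaphores per thread */

struct sketch_held_s
{
  int ceiling;             /* Priority ceiling of the semaphore */
  int counts;              /* Counts of that semaphore held by the thread */
};

struct sketch_thread_s
{
  int base_prio;           /* Priority when no semaphore is held */
  struct sketch_held_s held[SKETCH_MAX_HELD];
};

/* Record one more count of the semaphore tracked in 'slot' */

static void sketch_take(struct sketch_thread_s *t, size_t slot, int ceiling)
{
  t->held[slot].ceiling = ceiling;
  t->held[slot].counts++;
}

/* Give back one count of the semaphore tracked in 'slot' */

static void sketch_release(struct sketch_thread_s *t, size_t slot)
{
  if (t->held[slot].counts > 0)
    {
      t->held[slot].counts--;
    }
}

/* The thread runs at the higher of its base priority and the highest
 * ceiling of any semaphore on which it still holds at least one count.
 */

static int sketch_effective_prio(const struct sketch_thread_s *t)
{
  int prio = t->base_prio;
  size_t i;

  for (i = 0; i < SKETCH_MAX_HELD; i++)
    {
      if (t->held[i].counts > 0 && t->held[i].ceiling > prio)
        {
          prio = t->held[i].ceiling;
        }
    }

  return prio;
}
```

Even this toy version needs a table of (ceiling, counts) pairs per thread
and a scan on every release, which is the complexity that the excerpt
complains about.
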
Title: INAPPROPRIATE USE OF sched_lock() BY pthreads
Description: In the implementation of standard pthread functions, the
non-standard NuttX function sched_lock() is used. This is very
strong since it disables pre-emption for all threads in all
task groups. I believe it is only really necessary in most
cases to lock threads in the task group with a new
non-standard interface, say pthread_lock().

This is because the OS resources used by a thread such as
mutexes, condition variables, barriers, etc. are only
meaningful from within the task group. So, in order to
perform exclusive operations on these resources, it is
only necessary to block other threads executing within the
task group.

This is an easy change: pthread_lock() and pthread_unlock()
would simply operate on a semaphore retained in the task
group structure. I am, however, hesitant to make this change:
In the FLAT build model, there is nothing that prevents people
from accessing the inter-thread controls from threads in
different task groups. Making this change, while correct,
might introduce subtle bugs in code by people who are not
using NuttX correctly.
Status: Open
Priority: Low. This change would improve real-time performance of the
OS but is not otherwise required.

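A rough sketch of the proposed pthread_lock()/pthread_unlock() pair from
the item above, using a POSIX semaphore held in a stand-in for the task
group structure. The sketch_group_s type and the sketch_* function names
are hypothetical; the real task group structure and any final interface
would differ.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical stand-in for the task group structure */

struct sketch_group_s
{
  sem_t lock;              /* Serializes pthread operations in the group */
};

static int sketch_group_init(struct sketch_group_s *group)
{
  /* Binary semaphore, initially available */

  return sem_init(&group->lock, 0, 1);
}

/* Block only the other threads of this task group that also take the lock */

static int sketch_pthread_lock(struct sketch_group_s *group)
{
  int ret;

  do
    {
      ret = sem_wait(&group->lock);
    }
  while (ret < 0 && errno == EINTR);

  return ret;
}

static int sketch_pthread_unlock(struct sketch_group_s *group)
{
  return sem_post(&group->lock);
}
```

Unlike sched_lock(), this simple version does not nest: a thread that
calls sketch_pthread_lock() twice without unlocking would block itself, so
a real implementation would probably need a holder and a count.
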
o Message Queues (sched/mqueue)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

o Work Queues (sched/wqueue)
^^^^^^^^^^^^^^^^^^^^^^^^^^

Title: WORK QUEUE DELAY INACCURACIES
Description: Each queued work may have an optional delay value associated
with it. That delay should be with respect to the time that the
work is queued. However, since we do not know the time the
work is queued, the actual delay will be with respect to the time
that the work is processed. Under certain conditions, the
work may sit in the queue for some time before it is
processed, leading to an inaccuracy in the delay.

One solution might involve saving the time in the work
structure when the work is queued. Then the delay logic can
take the difference between the processing time and the
queued time to get a more accurate delay.
Status: Open
Priority: In all known use cases, the priority is low. A problem
would only occur if the work queue is overloaded or if work in
the work queue suspends waiting for a resource (both of which
are much bigger problems).

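The solution suggested in the item above might be sketched like this. The
sketch_work_s structure, its field names, and the 32-bit tick type are
assumptions made for illustration only; they are not the actual work queue
structure.

```c
#include <stdint.h>

typedef uint32_t sketch_tick_t;   /* Hypothetical system tick counter type */

struct sketch_work_s
{
  sketch_tick_t qtime;            /* Tick at which the work was queued */
  sketch_tick_t delay;            /* Requested delay, relative to qtime */
};

/* Save the queue time together with the requested delay */

static void sketch_work_queue(struct sketch_work_s *work,
                              sketch_tick_t now, sketch_tick_t delay)
{
  work->qtime = now;
  work->delay = delay;
}

/* Ticks that still have to elapse at time 'now', or 0 if the work is due.
 * The unsigned subtraction keeps working when the tick counter wraps.
 */

static sketch_tick_t sketch_work_remaining(const struct sketch_work_s *work,
                                           sketch_tick_t now)
{
  sketch_tick_t elapsed = now - work->qtime;

  return elapsed >= work->delay ? 0 : work->delay - elapsed;
}
```

The worker would call sketch_work_remaining() when it examines the queue,
so time already spent waiting in the queue counts against the delay.
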
o Kernel/Protected Build
^^^^^^^^^^^^^^^^^^^^^^

Title: C++ CONSTRUCTORS HAVE TOO MANY PRIVILEGES (PROTECTED MODE)
Description: When a C++ ELF module is loaded, its C++ constructors are called
via sched/task_starthook.c logic. This logic runs in protected mode.
This is a security hole because the user code runs with kernel
privileges when the constructor executes.

Destructors likely have the opposite problem. They probably try to
execute some kernel logic in user mode? Obviously this needs to
be investigated further.
Status: Open
Priority: Low (unless you need to build a secure C++ system).

Title: TOO MANY SYSCALLS
Description: There are a few syscalls that operate very often in user space.
Since syscalls are (relatively) time consuming this could be
a performance issue. Here are some numbers that I collected
in an application that was doing mostly printf output:

  sem_post - 18% of syscalls
  sem_wait - 18% of syscalls
  getpid   - 59% of syscalls
  --------------------------
             95% of syscalls

Obviously system performance could be improved greatly by simply
optimizing these functions so that they do not need to make system
calls so frequently. This getpid() call is part of the re-entrant
semaphore logic used with printf() and other C buffered I/O.
Something like TLS might be used to retain the thread's ID
locally.

Linux, for example, has functions called up() and down(). up()
increments the semaphore count but does not call into the kernel
unless incrementing the count unblocks a task; similarly, down()
decrements the count and does not call into the kernel unless
the count becomes negative and the caller must be blocked.

Update:
"I am thinking that there should be a "magic" global, user-
accessible variable that holds the PID of the currently
executing thread; basically the PID of the task at the head
of the ready-to-run list. This variable would have to be reset
each time the head of the ready-to-run list changes.

"Then getpid() could be implemented in user space with no system call
by simply reading this variable.

"This one would be easy: Just a change to include/nuttx/userspace.h,
boards/<arch>/<chip>/<board>/kernel/up_userspace.c, libs/libc/,
sched/sched_addreadytorun.c, and sched/sched_removereadytorun.c.
That would eliminate 59% of the syscalls."

Update:
This is probably also just a symptom of the OS test that does mostly
console output. The requests for the PID are part of the
implementation of the I/O's re-entrant semaphore implementation and
would not be an issue in the more general case.

Update:
One solution might be to use TLS, add the PID to struct
tls_info_s. Then the PID could be obtained without a system call.
TLS is not very useful in the FLAT build, however. TLS works by
putting per-thread data at the bottom of an aligned stack. The
current stack pointer is then ANDed with the alignment mask to
obtain the per-thread data address.

There are problems with this in the FLAT and PROTECTED builds:
First the maximum size of the stack is limited by the number
of bits in the mask. This means that you need to have a very
high alignment to support tasks with large stacks. But
secondly, the higher the alignment of the stacks, the
more memory is lost to fragmentation.

In the KERNEL build, the stack lies at a virtual address
and it is possible to have highly aligned stacks with no such
penalties.
Status: Open
Priority: Low-Medium. Right now, I do not know if these syscalls are a
real performance issue or not. The above statistics were collected
from an atypical application (the OS test) that does an excessive
amount of console output. There is probably no issue with more typical
embedded applications.

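The TLS lookup described in the last update above (AND the stack pointer
with the alignment mask) can be sketched as follows. The 8 KiB alignment,
the sketch_tls_info_s layout, and the function names are assumptions for
illustration; they are not the actual struct tls_info_s.

```c
#include <stdint.h>

/* Hypothetical: every stack is aligned to (and no larger than) 8 KiB */

#define SKETCH_TLS_LOG2_ALIGN 13u
#define SKETCH_TLS_ALIGN      ((uintptr_t)1 << SKETCH_TLS_LOG2_ALIGN)
#define SKETCH_TLS_MASK       (~(SKETCH_TLS_ALIGN - 1))

/* Hypothetical per-thread data kept at the lowest address of the stack */

struct sketch_tls_info_s
{
  int pid;                 /* Cached thread ID, readable without a syscall */
};

/* Base address of the aligned stack that contains stack pointer 'sp' */

static inline uintptr_t sketch_tls_base(uintptr_t sp)
{
  return sp & SKETCH_TLS_MASK;
}

/* The per-thread data lives at that base address */

static inline struct sketch_tls_info_s *sketch_tls_get(uintptr_t sp)
{
  return (struct sketch_tls_info_s *)sketch_tls_base(sp);
}
```

The trade-off noted above is visible in the constants: a larger
SKETCH_TLS_LOG2_ALIGN allows larger stacks, but every stack allocation
must then be aligned to that size, which wastes more memory in the FLAT
and PROTECTED builds.
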
Title: SECURITY ISSUES
Description: In the current design, the kernel code calls into the user-space
allocators to allocate user-space memory. It is a security risk to
call into user-space in kernel-mode because that could be exploited
to gain control of the system. That could be fixed by dropping to
user mode before trapping into the memory allocators; the memory
allocators would then need to trap in order to return (this is
already done to return from signal handlers; that logic could be
renamed more generally and just used for a generic return trap).

Another place where the system calls into the user code in kernel
mode is work_usrstart() to start the user work queue. That is
another security hole that should be plugged.
Status: Open
Priority: Low (unless security becomes an issue).

Title: MICRO-KERNEL
Description: The initial kernel build cut many interfaces at a very high level.
The resulting monolithic kernel is then rather large. It would
not be a prohibitively large task to reorganize the interfaces so
that NuttX is built as a micro-kernel, i.e., with only the core
OS services within the kernel and with other OS facilities, such
as the file system, message queues, etc., residing in user-space
and interfacing with those core OS facilities through traps.
Status: Open
Priority: Low. This is a good idea and certainly an architectural
improvement. However, there is no strong motivation now to
do that partitioning work.

Title: USER MODE TASKS CAN MODIFY PRIVILEGED TASKS
Description: Certain interfaces, such as sched_setparam(),
sched_setscheduler(), etc. can be used by user mode tasks to
modify the behavior of privileged kernel threads.
For a truly secure system, privileges need to be checked in
every interface that permits one thread to modify the
properties of another thread.

NOTE: It would be a simple matter to simply disable user
threads from modifying privileged threads. However, you
might also want to be able to modify privileged threads from
user tasks with certain permissions. Permissions are a much
more complex issue.

task_delete(), for example, is not permitted to kill a kernel
thread. But should not a privileged user task be able to do
so?
Status: Open
Priority: Low for most embedded systems but would be a critical need if
NuttX were used in a secure system.

Title: SIGNAL ACTION VULNERABILITY
Description: When a signal action is performed, the user stack is used.
Unlike Linux, applications do not have separate user and
supervisor stacks; everything is done on the user stack.

In the implementation of up_sigdeliver(), a copy of the
register contents that will be restored is present on the
stack and could be modified by the user application. Thus,
if the user mucks with the return stack, problems could
occur when the user task returns to supervisor mode from
the signal handler.

A recent commit (3 Feb 2019) does protect the status register
and return address so that a malicious task cannot change the
return address or switch to supervisor mode. Other registers
are still modifiable so there is other possible mayhem that
could be done.

A better solution, in lieu of a kernel stack, would be to
eliminate the stack-based register save area altogether and,
instead, save the registers in another, dedicated state save
area in the TCB. The only hesitation to this option is that
it would significantly increase the size of the TCB structure
and, hence, the per-thread memory overhead.
Status: Open
Priority: Medium-ish if you are attempting to make a secure environment that
may host malicious code. Very low for the typical FLAT build,
however.

o C++ Support
^^^^^^^^^^^

Title: STATIC CONSTRUCTORS AND MULTITASKING
Description: The logic that calls static constructors operates on the main
thread of the initial user application task. Any static
constructors that cache task/thread specific information such
as C streams or file descriptors will not work in other tasks.
See also UCLIBC++ AND STATIC CONSTRUCTORS below.
Status: Open
Priority: Low and probably will not be changed. In these cases, there will
need to be an application specific solution.

Title: UCLIBC++ AND STATIC CONSTRUCTORS
Description: uClibc++ was designed to work in a Unix environment with
processes and with separately linked executables. Each process
has its own, separate uClibc++ state. uClibc++ would be
instantiated like this in Linux:

1) When the program is built, a tiny start-up function is
   included at the beginning of the program. Each program has
   its own, separate list of C++ constructors.

2) When the program is loaded into memory, space is set aside
   for uClibc's static objects and then this special start-up
   routine is called. It initializes the C library, calls all
   of the constructors, and calls atexit() so that the destructors
   will be called when the process exits.

In this way, you get a per-process uClibc++ state since there
is per-process storage of uClibc++ global state and per-process
initialization of uClibc++ state.

Compare this to how NuttX (and most embedded RTOSs) would work:

1) The entire FLASH image is built as one big blob. All of the
   constructors are lumped together and all called together at
   one time.

   This, of course, does not have to be so. We could segregate
   constructors by some criteria and we could use a task start
   up routine to call constructors separately. We could even
   use ELF executables that are separately linked and already
   have their constructors separately called when the ELF
   executable starts.

   But this would not do you very much good in the case of
   uClibc++ because:

2) NuttX does not support processes, i.e., separate address
   environments for each task. As a result, the scope of global
   data is all tasks. Any change to the global state made by
   one task can affect another task. There can be only one
   uClibc++ state and it will be shared by all tasks. uClibc++
   apparently relies on global instances (at least for cin and
   cout) so there is no way to have any unique state for any
   "task group".

   [NuttX does not support processes because in order to have
   true processes, your hardware must support a memory management
   unit (MMU) and I am not aware of any mainstream MCU that has
   an MMU (or, at least an MMU that is capable enough to support
   processes).]

NuttX does not have processes, but it does have "task groups".
See https://cwiki.apache.org/confluence/display/NUTTX/Tasks+vs.+Threads+FAQ.
A task group is the task plus all of the pthreads created by
the task via pthread_create(). Resources like FILE streams
are shared within a task group. Task groups are like a poor
man's process.

This means that if the uClibc++ static classes are initialized
by one member of a task group, then cin/cout should work
correctly with all threads that are members of the task group. The
destructors would be called when the final member of the task
group exits (if registered via atexit()).

So if you use only pthreads, uClibc++ should work very much like
it does in Linux. If your NuttX usage model is like one process
with many threads then you have Linux compatibility.

If you wanted to have uClibc++ work across task groups, then
uClibc++ and NuttX would need some extensions. I am thinking
along the lines of the following:

1) There is a per-task group storage area within the RTOS (see
   include/nuttx/sched.h). If we add some new, non-standard APIs
   then uClibc++ could get access to per-task group storage (in
   the spirit of pthread_getspecific() which gives you access to
   per-thread storage).

2) Then move all of uClibc++'s global state into per-task group
   storage and add a uClibc++ initialization function that would:
   a) allocate per-task group storage, b) call all of the static
   constructors, and c) register with atexit() to perform
   clean-up when the task group exits.

That would be a fair amount of effort. I don't really know what
the scope of such an effort would be. I suspect that it is not
large but probably complex.

NOTES:

1) See STATIC CONSTRUCTORS AND MULTITASKING

2) To my knowledge, only some uClibc++ ofstream logic is
   sensitive to this. All other statically initialized classes
   seem to work OK across different task groups.
Status: Open
Priority: Low. I have no plan to change this logic now unless there is
some strong demand to do so.

o Binary loaders (binfmt/)
^^^^^^^^^^^^^^^^^^^^^^^^

Title: NXFLAT TESTS
Description: Not all of the NXFLAT tests under apps/examples/nxflat are working.
Most simply do not compile yet. tests/mutex runs okay but
outputs garbage on completion.

Update: 13-27-1, tests/mutex crashed with a memory corruption
problem the last time that I ran it.
Status: Open
Priority: High

Title: ARM UP_GETPICBASE()
Description: The ARM up_getpicbase() does not seem to work. This means
that some features like wdog's might not work in NXFLAT modules.
Status: Open
Priority: Medium-High

Title: NXFLAT READ-ONLY DATA IN RAM
Description: At present, all .rodata must be put into RAM. There is a
tentative design change that might allow .rodata to be placed
in FLASH (see Documentation/NuttXNxFlat.html).
Status: Open
Priority: Medium

Title: GOT-RELATIVE FUNCTION POINTERS
Description: If the function pointer to a statically defined function is
taken, then GCC generates a relocation that cannot be handled
by NXFLAT. There is a solution described in Documentation/NuttXNxFlat.html,
but that would require a compiler change (which we want to avoid).
The simple workaround is to make such functions global in scope.
Status: Open
Priority: Low (probably will not fix)

Title: USE A HASH INSTEAD OF A STRING IN SYMBOL TABLES
Description: In the NXFLAT symbol tables... Using a 32-bit hash value instead
of a string to identify a symbol should result in a smaller footprint.
Status: Open
Priority: Low

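One possible 32-bit hash for the item above is FNV-1a. The TODO item does
not name a hash function, so the choice of FNV-1a and the sketch_symhash()
name are assumptions used only to illustrate the idea.

```c
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated symbol name */

static uint32_t sketch_symhash(const char *name)
{
  uint32_t hash = UINT32_C(2166136261);   /* FNV offset basis */

  while (*name != '\0')
    {
      hash ^= (uint8_t)*name++;           /* Mix in the next byte */
      hash *= UINT32_C(16777619);         /* Multiply by the FNV prime */
    }

  return hash;
}
```

A symbol table entry would then hold a 4-byte hash in place of a pointer
to a name string. Two different names can produce the same hash, so the
tooling that builds the table would have to detect and reject collisions.
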
Title: WINDOWS-BASED TOOLCHAIN BUILD
Description: Windows build issue. Some of the configurations that use NXFLAT have
the linker script specified like this:

  NXFLATLDFLAGS2 = $(NXFLATLDFLAGS1) -T$(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld -no-check-sections

That will not work for Windows-based tools because they require Windows
style paths. The solution is to do something like this:

  ifeq ($(CONFIG_CYGWIN_WINTOOL),y)
    NXFLATLDSCRIPT=${shell cygpath -w $(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld}
  else
    NXFLATLDSCRIPT=$(TOPDIR)/binfmt/libnxflat/gnu-nxflat-gotoff.ld
  endif

Then use

  NXFLATLDFLAGS2 = $(NXFLATLDFLAGS1) -T"$(NXFLATLDSCRIPT)" -no-check-sections

Status: Open
Priority: There are too many references like the above. They will have
to get fixed as needed for Windows native tool builds.

o Network (net/, drivers/net)
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Title: LISTENING FOR UDP BROADCASTS
Description: Incoming UDP broadcasts should only be accepted if listening on
INADDR_ANY(?)
Status: Open
Priority: Low

Title: CONCURRENT, UNBUFFERED TCP SEND OPERATIONS
Description: At present, there cannot be two concurrent active TCP send
operations in progress using the same socket *unless*
CONFIG_TCP_WRITE_BUFFER is selected. This is because the uIP ACK
logic will support only one transfer at a time.

Such a situation could occur if explicit TCP send operations
are performed using the same socket (or dup's of the same
socket) on two different threads. It can also occur implicitly
when you execute more than one thread over an NSH Telnet
session.

There are two possible solutions:

1. Remove option to build the network without write buffering
   enabled. This is the simplest and perhaps the best option.
   Certainly a system can be produced with a smaller RAM
   footprint without write buffering. However, that probably
   does not justify permitting a crippled system.

2. Another option is to serialize the non-buffered writes for
   a socket with a mutex, i.e., add a mutex to make sure that
   each send that is started is able to be the exclusive
   sender until all of the data to be sent has been ACKed.
   That can be a very significant delay involving the send,
   waiting for the ACK or a timeout and possible retransmissions!

Although it uses more memory, I believe that option 1 is the
better solution and will avoid difficult TCP bugs in the future.

Status: Open.
Priority: Medium-Low. This is only an important issue for people who
use multi-threaded, unbuffered TCP networking without a full
understanding of the issues.

Title: POLL/SELECT ON TCP/UDP SOCKETS NEEDS READ-AHEAD
Description: poll()/select() only works for availability of buffered TCP/UDP
read data (when read-ahead is enabled). The way writing is
handled in the network layer, either (1) if CONFIG_UDP/TCP_WRITE_BUFFERS=y
then we never have to wait to send; or (2) otherwise, we always
have to wait to send. So it is impossible to notify the caller
when it can send without waiting.

An exception to "never having to wait" is the case where we are
out of memory for use in write buffering. In that case, the
blocking send()/sendto() would have to wait for the memory
to become available.
Status: Open, probably will not be fixed.
Priority: Medium... this does affect porting of applications that expect
different behavior from poll()/select()

Title: INTERFACES TO LEAVE/JOIN IGMP MULTICAST GROUP
Description: The interfaces used to leave/join IGMP multicast groups are non-standard.
RFC3678 (IGMPv3) suggests ioctl() commands to do this (SIOCSIPMSFILTER) but
also states that those APIs are historic. NuttX implements these ioctl
commands, but is non-standard because: (1) It does not support IGMPv3, and
(2) it looks up drivers by their device name (e.g., "eth0") vs IP address.

Linux uses setsockopt() to control multicast group membership using the
IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP options. It also looks up drivers
using IP addresses (It would require additional logic in NuttX to look up
drivers by IP address). See http://tldp.org/HOWTO/Multicast-HOWTO-6.html
Status: Open
Priority: Medium. All standards compatibility is important to NuttX. However, most
of the mechanism for leaving and joining groups is hidden behind a wrapper
function so that little of this incompatibility need be exposed.

Title: CLOSED CONNECTIONS IN THE BACKLOG
Description: If a connection is backlogged but accept() is not called quickly, then
that connection may time out. How should this be handled? Should the
connection be removed from the backlog if it times out or is closed?
Or should it remain in the backlog with a status indication so that accept()
can fail when it encounters the invalid connection?
Status: Open
Priority: Medium. Important on slow applications that will not accept
connections promptly.

Title: IPv6 REQUIRES ADDRESS FILTER SUPPORT
Description: IPv6 requires that the Ethernet driver support NuttX address
filter interfaces. Several Ethernet drivers do not support these,
however. Others support the address filtering interfaces but
have never been verified:

C5471, LM3S, ez80, DM0x90 NIC, PIC, LPC54: Do not support
address filtering.
Kinetis, LPC17xx, LPC43xx: Untested address filter support

Status: Open
Priority: Pretty high if you want to use IPv6 on these platforms.

Title: UDP MULTICAST RECEPTION
Description: The logic in udp_input() expects either a single receive socket or
none at all. However, multiple sockets should be capable of
receiving a UDP datagram (multicast reception). This could be
handled easily by something like:

  for (conn = NULL; conn = udp_active (pbuf, conn); )

If the callback logic that receives a packet responds with an
outgoing packet, however, then it will over-write the received
buffer. recvfrom() will not do that. We would have
to make that the rule: Recipients of a UDP packet must treat
the packet as read-only.
Status: Open
Priority: Low, unless your logic depends on that behavior.

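The loop suggested in the item above could be fleshed out along these
lines. This is a stand-alone sketch: sketch_udp_conn_s, sketch_udp_active()
and sketch_udp_deliver() are hypothetical stand-ins that match on the
local port only, and they do not reproduce the real udp_active()
signature or matching rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified UDP connection structure */

struct sketch_udp_conn_s
{
  struct sketch_udp_conn_s *next;  /* Next connection in the active list */
  uint16_t lport;                  /* Local port that the socket is bound to */
  unsigned int nreceived;          /* Number of datagrams delivered */
};

/* Return the next connection bound to 'port' that follows 'prev' in the
 * list, or the first such connection when 'prev' is NULL.
 */

static struct sketch_udp_conn_s *
sketch_udp_active(struct sketch_udp_conn_s *head, uint16_t port,
                  struct sketch_udp_conn_s *prev)
{
  struct sketch_udp_conn_s *conn = (prev != NULL) ? prev->next : head;

  while (conn != NULL && conn->lport != port)
    {
      conn = conn->next;
    }

  return conn;
}

/* Offer one received datagram to every matching connection.  The payload
 * is const: recipients must treat the packet as read-only.
 */

static unsigned int sketch_udp_deliver(struct sketch_udp_conn_s *head,
                                       uint16_t port,
                                       const uint8_t *payload, size_t len)
{
  struct sketch_udp_conn_s *conn = NULL;
  unsigned int ndelivered = 0;

  (void)payload;
  (void)len;

  while ((conn = sketch_udp_active(head, port, conn)) != NULL)
    {
      conn->nreceived++;           /* Stand-in for the receive callback */
      ndelivered++;
    }

  return ndelivered;
}
```

Passing the payload as a const pointer is one way to express the
read-only rule proposed above in the interface itself.
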
Title: NETWORK WON'T STAY DOWN
Description: If you enable the NSH network monitor (CONFIG_NSH_NETINIT_MONITOR)
then the NSH 'ifdown' command is broken. Doing 'nsh> ifdown eth0'
will, indeed, bring the network down. However, the network monitor
notices the change in the link status and will bring the network
back up. There needs to be some kind of interlock between
cmd_ifdown() and the network monitor thread to prevent this.
Status: Open
Priority: Low, this is just a nuisance in most cases.

Title: FIFO CLEAN-UP AFTER CLOSING UNIX DOMAIN DATAGRAM SOCKET
Description: FIFOs are used as the IPC underlying all local Unix domain
sockets. In NuttX, FIFOs are implemented as device drivers
(not as special FIFO files). The FIFO device driver is
instantiated when the Unix domain socket communications begin
and will automatically be released when (1) the driver is
unlinked and (2) all open references to the driver have been
closed. But there is no mechanism in place now to unlink the
FIFO when the Unix domain datagram socket is no longer used.
The primary issue is timing: the FIFO should persist until
it is no longer needed. Perhaps there should be a delayed
call to unlink() (using a watchdog or the work queue). If
the driver is re-opened, the delayed unlink could be
canceled? Needs more thought.
NOTE: This is not an issue for Unix domain stream sockets:
The end-of-life of the FIFO is well determined when sockets
are disconnected and support for that case is fully implemented.
Status: Open
Priority: Low for now because I don't have a situation where this is a
problem for me. If you use the same Unix domain paths, then
it is not an issue; in fact it is more efficient if the FIFO
devices persist. But this would be a serious problem if,
for example, you create new Unix domain paths dynamically.
In that case you would effectively have a memory leak and the
number of FIFO instances would grow.

Title: TCP IPv4-MAPPED IPv6 ADDRESSES
Description: The UDP implementation in net/udp contains support for hybrid
dual-stack IPv6/IPv4 implementations that utilize a special
class of addresses, the IPv4-mapped IPv6 addresses. You can
see that UDP implementation in:

  udp_callback.c:
    ip6_map_ipv4addr(ipv4addr,
  udp_send.c:
    ip6_is_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr)))
    ip6_is_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr))
    in_addr_t raddr = ip6_get_ipv4addr((FAR struct in6_addr*)conn->u.ipv6.raddr);

There is no corresponding support for TCP sockets.
Status: Open
Priority: Low. I don't know of any issues now, but I am sure that
someone will encounter this in the future.

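For reference, an IPv4-mapped IPv6 address has the form ::ffff:a.b.c.d,
i.e., ten zero bytes, two 0xff bytes, then the four IPv4 address bytes.
The helpers below sketch the three operations named in the item above on
plain byte arrays in network order. The sketch_* names are hypothetical
and are not the ip6_map_ipv4addr(), ip6_is_ipv4addr() and
ip6_get_ipv4addr() implementations.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Build ::ffff:a.b.c.d from the four IPv4 address bytes */

static void sketch_map_ipv4(const uint8_t ipv4[4], uint8_t addr6[16])
{
  memset(addr6, 0, 10);
  addr6[10] = 0xff;
  addr6[11] = 0xff;
  memcpy(&addr6[12], ipv4, 4);
}

/* True if the IPv6 address carries the ::ffff:0:0/96 prefix */

static bool sketch_is_ipv4_mapped(const uint8_t addr6[16])
{
  static const uint8_t prefix[12] =
  {
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
  };

  return memcmp(addr6, prefix, sizeof(prefix)) == 0;
}

/* Extract the embedded IPv4 address (caller checks the prefix first) */

static void sketch_get_ipv4(const uint8_t addr6[16], uint8_t ipv4[4])
{
  memcpy(ipv4, &addr6[12], 4);
}
```

TCP support would need the same prefix test on the remote address when a
connection is set up, so that an IPv4 peer can be reached through an IPv6
socket.
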
Title: MISSING netdb INTERFACES
Description: There is no implementation for many netdb interfaces such as
getnetbyname(), getprotobyname(), getnameinfo(), etc.
Status: Open
Priority: Low

Title: ETHERNET WITH MULTIPLE LPWORK THREADS
|
||
Description: Recently, Ethernet drivers were modified to support multiple
|
||
work queue structures. The question was raised: "My only
|
||
reservation would be, how would this interact in the case of
|
||
having CONFIG_STM32_ETHMAC_LPWORK and CONFIG_SCHED_LPNTHREADS
|
||
> 1? Can it be guaranteed that one work item won't be
|
||
interrupted and execution switched to another? I think so but
|
||
am not 100% confident."
|
||
|
||
I suspect that you right. There are probably vulnerabilities
|
||
in the CONFIG_STM32_ETHMAC_LPWORK with CONFIG_SCHED_LPNTHREADS
|
||
> 1 case. But that really doesn't depend entirely upon the
|
||
change to add more work queue structures. Certainly with only
|
||
work queue structure you would have concurrent Ethernet
|
||
operations in that multiple LP threads; just because the work
|
||
structure is available, does not mean that there is not dequeued
|
||
work in progress. The multiple structures probably widens the
|
||
window for that concurrency, but does not create it.
|
||
|
||
The current Ethernet designs depend upon a single work queue to
|
||
serialize data. In the case of multiple LP threads, some
|
||
additional mechanism would have to be added to enforce that
|
||
serialization.
|
||
|
||
NOTE: Most drivers will call net_lock() and net_unlock() around
|
||
the critical portions of the driver work. In that case, all work
|
||
will be properly serialized. This issue only applies to drivers
|
||
that may perform operations that require protection outside of
|
||
the net_lock'ed region. Sometimes, this may require extending
|
||
the net_lock() to the beginning of the driver work function.
|
||
|
||
Status: Open
|
||
Priority: High if you happen to be using Ethernet in this configuration.
|
||
|
||
Title: NETWORK DRIVERS USING HIGH PRIORITY WORK QUEUE
|
||
Description: Many network drivers run the network on the high priority work
|
||
queue thread (or support an option to do so). Networking should
|
||
not be done on the high priority work thread because it interferes
|
||
with real-time behavior. Fix by forcing all network drivers to
|
||
run on the low priority work queue.
|
||
Status: Open
|
||
Priority: Low. Not such a big deal for network test and demo
|
||
configurations except that it provides a bad example for a product
|
||
OS configuration.
|
||
|
||
Title: REPARTITION DRIVER FUNCTIONALITY
|
||
Description: Every network driver performs the first level of packet decoding.
|
||
It examines the packet header and calls ipv4_input(), ipv6_input(),
|
||
icmp_input(), etc. as appropriate. This is a maintenance problem
|
||
because it means that any changes to the network input interfaces
|
||
affect all drivers.
|
||
|
||
A better, more maintainable solution would use a single net_input()
|
||
function that would receive all incoming packets. This function
|
||
would then perform the common packet decoding logic that is
|
||
currently implemented in every network driver.
|
||
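A minimal sketch of the common decoding step, assuming Ethernet framing.
The function and enumeration names are hypothetical; a real net_input()
would call ipv4_input(), ipv6_input(), and so on rather than return a
code.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDRLEN    14       /* Destination MAC, source MAC, EtherType */
#define ETHTYPE_IPV4  0x0800
#define ETHTYPE_ARP   0x0806
#define ETHTYPE_IPV6  0x86dd

/* Hypothetical result codes standing in for the per-protocol input calls */

enum net_proto
{
  NET_PROTO_UNKNOWN = 0,
  NET_PROTO_IPV4,
  NET_PROTO_ARP,
  NET_PROTO_IPV6
};

/* Decode the EtherType once, in one place, instead of in every driver */

enum net_proto net_input_classify(const uint8_t *frame, size_t len)
{
  uint16_t ethtype;

  if (frame == NULL || len < ETH_HDRLEN)
    {
      return NET_PROTO_UNKNOWN;
    }

  /* The EtherType is big-endian at byte offsets 12 and 13 */

  ethtype = (uint16_t)((frame[12] << 8) | frame[13]);
  switch (ethtype)
    {
      case ETHTYPE_IPV4:
        return NET_PROTO_IPV4;

      case ETHTYPE_ARP:
        return NET_PROTO_ARP;

      case ETHTYPE_IPV6:
        return NET_PROTO_IPV6;

      default:
        return NET_PROTO_UNKNOWN;
    }
}
```

With something like this in the network layer, a driver would only hand
over the received frame and would not need to change when the input
interfaces change.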
Status: Open
|
||
Priority: Low. Really just an aesthetic maintainability issue.
|
||
|
||
Title: BROADCAST WITH MULTIPLE NETWORK INTERFACES
|
||
Description: There is currently no mechanism to send a broadcast packet
|
||
out through several network interfaces. Currently packets
|
||
can be sent to only one device. Logic in netdev_findby_ipvXaddr()
|
||
currently just selects the first device in the list of
|
||
devices; only that device will receive broadcast packets.
|
||
Status: Open
|
||
Priority: High if you require broadcast on multiple networks. There is
|
||
no simple solution known at this time, however. Perhaps
|
||
netdev_findby_ipvXaddr() should return a list of devices rather
|
||
than a single device? All upstream logic would then have to
|
||
deal with a list of devices. That would be a huge effort and
|
||
certainly doesn't count as a "simple solution".
|
||
|
||
Title: ICMPv6 FOR 6LoWPAN
|
||
Description: The current ICMPv6 and neighbor-related logic only works with
|
||
Ethernet MAC. For 6LoWPAN, a new more conservative IPv6
|
||
neighbour discovery is provided by RFC 6775. This RFC needs to
|
||
be supported in order to support ping6 on a 6LoWPAN network.
|
||
If RFC 6775 were implemented, then arbitrary IPv6 addresses,
|
||
including addresses from DHCPv6 could be used.
|
||
|
||
UPDATE: With IPv6 neighbor discovery, any IPv6 address may
|
||
be associated with any short or extended address. In fact,
|
||
that is the whole purpose of the neighbor discovery logic: It
|
||
plays the same role as ARP in IPv4; it ultimately just manages
|
||
a neighbor table that, like the arp table, provides the
|
||
mapping between IP addresses and node addresses.
|
||
|
||
The NuttX, Contiki-based 6LoWPAN implementation circumvented
|
||
the need for the neighbor discovery logic by using only MAC-
|
||
based addressing, i.e., the lower two or eight bytes of the
|
||
IP address are the node address.
|
||
|
||
Most of the 6LoWPAN compression algorithms exploit this to
|
||
compress the IPv6 address to nothing but a bit indicating
|
||
that the IP address derives from the node address. So I
|
||
think IPv6 neighbor discovery is useless in the current
|
||
implementation.
|
||
|
||
If we want to use IPv6 neighbor discovery, we could dispense
|
||
with all of the MAC-based addressing. But if we want to retain
|
||
the more compact MAC-based addressing, then we don't need
|
||
IPv6 neighbor discovery.
|
||
|
||
So, the full neighbor discovery logic is not currently useful,
|
||
but it would still be nice to have enough in place to support
|
||
ping6. Full neighbor support would probably be necessary if we
|
||
wanted to route 6LoWPAN frames outside of the WPAN.
|
||
|
||
Status: Open
|
||
Priority: Low for now. I don't plan on implementing this. It would
|
||
only be relevant if we were to decide to abandon the use of
|
||
MAC-based addressing in the 6LoWPAN implementation.
|
||
|
||
Title: ETHERNET LOCAL BROADCAST DOES NOT WORK
|
||
Description: In case of "local broadcast" the system still sends an ARP
|
||
request to the destination, but it shouldn't, it should
|
||
broadcast. For example, if the system has a network with IP
|
||
10.0.0.88 and a netmask of 255.255.255.0, it should send
|
||
messages for 10.0.0.255 as broadcast, and not send ARP
|
||
for 10.0.0.255.
|
||
|
||
For easier networking, the next line should have given
|
||
me the broadcast address of the network, but it doesn't:
|
||
|
||
ioctl(_socket_fd, SIOCGIFBRDADDR, &bc_addr);
|
||
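The expected broadcast address follows directly from the interface
address and netmask. A minimal sketch with hypothetical function names,
assuming addresses are held in host byte order:

```c
#include <stdbool.h>
#include <stdint.h>

/* Subnet-directed broadcast: keep the network bits, set all host bits.
 * Addresses are in host byte order here for readability.
 */

uint32_t ipv4_subnet_broadcast(uint32_t addr, uint32_t netmask)
{
  return (addr & netmask) | ~netmask;
}

/* True if dest should be sent as a link-level broadcast (no ARP) */

bool ipv4_is_broadcast(uint32_t dest, uint32_t addr, uint32_t netmask)
{
  return dest == 0xffffffffu ||
         dest == ipv4_subnet_broadcast(addr, netmask);
}
```

For 10.0.0.88 (0x0a000058) with netmask 255.255.255.0 (0xffffff00) this
gives 0x0a0000ff, i.e., 10.0.0.255.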
Status: Open
|
||
Priority: Medium
|
||
|
||
Title: TCP ISSUES WITH QUICK CLOSE
|
||
Description: This failure has been reported in the accept() logic:
|
||
|
||
- psock_tcp_accept() waits on net_lockedwait() below
|
||
- The accept operation completes, the socket is in the connected
|
||
state and psock_accept() is awakened. It cannot run,
|
||
however, because its priority is low and so it is blocked
|
||
from execution.
|
||
- In the meantime, the remote host sends a
|
||
packet which is presumably caught in the read-ahead buffer.
|
||
- Then the remote host closes the socket. Nothing happens on
|
||
the target side because net_start_monitor() has not yet been
|
||
called.
|
||
- Then accept() finally runs, but not with a connected but
|
||
rather with a disconnected socket. This fails when it
|
||
attempts to start the network monitor on the disconnected
|
||
socket below.
|
||
- It is also impossible to read the buffered TCP data from a
|
||
disconnected socket. The TCP recvfrom() logic would also
|
||
need to permit reading buffered data from a disconnected
|
||
socket.
|
||
|
||
This problem was reported when the target hosted an FTP server
|
||
and files were being accessed by FileZilla.
|
||
|
||
connect() most likely has this same issue.
|
||
|
||
A work-around might be to raise the priority of the thread
|
||
that calls accept(). accept() might also need to check the
|
||
tcpstateflags in the connection structure before returning
|
||
in order to assure that the socket truly is connected.
|
||
Status: Open
|
||
Priority: Medium. I have never heard of this problem being reported
|
||
before, so I suspect it might not be so prevalent as one
|
||
might expect.
|
||
|
||
Title: LOCAL DATAGRAM RECVFROM RETURNS WRONG SENDER ADDRESS
|
||
Description: The recvfrom logic for local datagram sockets returns the
|
||
incorrect sender "from" address. Instead, it returns the
|
||
receiver's "to" address. This means that returning a reply
|
||
to the "from" address results in the receiver sending a packet to itself.
|
||
Status: Open
|
||
Priority: Medium High. This makes using local datagram sockets in
|
||
anything but a well-known point-to-point configuration
|
||
impossible.
|
||
|
||
o USB (drivers/usbdev, drivers/usbhost)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: USB STORAGE DRIVER DELAYS
|
||
Description: There is a workaround for a bug in drivers/usbdev/usbdev_storage.c
|
||
that involves delays. This needs to be redesigned to eliminate these
|
||
delays. See logic conditioned on CONFIG_USBMSC_RACEWAR.
|
||
|
||
If queuing of stall requests is supported by the DCD then this workaround
|
||
is not required. In this case, (1) the stall is not sent until all
|
||
write requests preceding the stall request are sent, (2) the stall is
|
||
sent, and then after the stall is cleared, (3) all write requests
|
||
queued after the stall are sent.
|
||
|
||
See, for example, the queuing of pending stall requests in the SAM3/4
|
||
UDP driver at arch/arm/src/sam34/sam_udp.c. There the logic to do this
|
||
is implemented with a normal request queue, a pending request queue, a
|
||
stall flag and a stall pending flag:
|
||
|
||
1) If the normal request queue is not empty when the STALL request is
|
||
received, the stall pending flag is set.
|
||
2) If additional write requests are received while the stall pending flag
|
||
is set (or while waiting for the stall to be sent), those write requests
|
||
go into the pending queue.
|
||
3) When the normal request queue empties successfully and all of the write
|
||
transfers complete, the STALL is sent. The stall pending flag is
|
||
cleared and the stall flag is set. Now the endpoint is really stalled.
|
||
4) After the STALL is cleared (via the Clear Feature SETUP), the pending
|
||
request queue is copied to the normal request queue, the stall flag is
|
||
cleared, and normal write request processing resumes.
|
||
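The four steps above can be sketched as a small state machine. This is
not the sam_udp.c code: the names are hypothetical and the two queues are
reduced to simple counters. It also assumes that writes arriving while
the endpoint is actually stalled go to the pending queue as well.

```c
#include <stdbool.h>

/* Hypothetical endpoint state; queues are modeled as request counts */

struct ep_state
{
  int  nreqq;          /* Requests in the normal request queue */
  int  npendq;         /* Requests in the pending request queue */
  bool stalled;        /* Endpoint is really stalled */
  bool pending_stall;  /* STALL deferred until the normal queue drains */
};

/* Step 2: writes received while a stall is pending (or, by assumption,
 * while stalled) are parked on the pending queue.
 */

void ep_submit_write(struct ep_state *ep)
{
  if (ep->stalled || ep->pending_stall)
    {
      ep->npendq++;
    }
  else
    {
      ep->nreqq++;
    }
}

/* Step 1: defer the STALL if earlier writes are still queued */

void ep_stall(struct ep_state *ep)
{
  if (ep->nreqq > 0)
    {
      ep->pending_stall = true;
    }
  else
    {
      ep->stalled = true;
    }
}

/* Step 3: when the normal queue empties, send the deferred STALL */

void ep_write_complete(struct ep_state *ep)
{
  if (ep->nreqq > 0)
    {
      ep->nreqq--;
    }

  if (ep->nreqq == 0 && ep->pending_stall)
    {
      ep->pending_stall = false;
      ep->stalled       = true;
    }
}

/* Step 4: Clear Feature moves pending requests back to the normal queue */

void ep_clear_stall(struct ep_state *ep)
{
  ep->nreqq  += ep->npendq;
  ep->npendq  = 0;
  ep->stalled = false;
}
```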
|
||
Status: Open
|
||
Priority: Medium
|
||
|
||
Title: EP0 OUT CLASS DATA
|
||
Description: There is no mechanism in place to handle EP0 OUT data transfers.
|
||
There are two aspects to this problem, neither of which is easy to fix
|
||
(only because of the number of drivers that would be impacted):
|
||
|
||
1. The class drivers only send EP0 write requests and these are
|
||
only queued on EP0 IN by these drivers. There is never a read
|
||
request queued on EP0 OUT.
|
||
2. But EP0 OUT data could be buffered in a buffer in the driver
|
||
data structure. However, there is no method currently
|
||
defined in the USB device interface to obtain the EP0 data.
|
||
|
||
Updates: (1) The USB device-to-class interface has been extended so
|
||
that EP0 OUT data can accompany the SETUP request sent to the
|
||
class drivers. (2) The logic in the STM32 F4 OTG FS device driver
|
||
has been extended to provide this data. Updates are still needed
|
||
to other drivers.
|
||
|
||
Here is an overview of the required changes:
|
||
Two buffers in the driver structure:
|
||
|
||
1. The existing EP0 setup request buffer (ctrlreq, 8 bytes)
|
||
2. A new EP0 data buffer in the driver state structure (ep0data,
|
||
max packetsize)
|
||
|
||
Add a new state:
|
||
|
||
3. Waiting for EP0 setup OUT data (EP0STATE_SETUP_OUT)
|
||
|
||
General logic flow:
|
||
|
||
1. When an EP0 SETUP packet is received:
|
||
- Read the request into EP0 setup request buffer (ctrlreq,
|
||
8 bytes)
|
||
- If this is an OUT request with data length, set the EP0
|
||
state to EP0STATE_SETUP_OUT and wait to receive data on
|
||
EP0.
|
||
- Otherwise, the SETUP request may be processed now (or,
|
||
in the case of the F4 driver, at the conclusion of the
|
||
SETUP phase).
|
||
2. When the EP0 OUT DATA packet is received:
|
||
- Verify state is EP0STATE_SETUP_OUT
|
||
- Read the data into the EP0 data buffer (ep0data, max
|
||
packet size)
|
||
- Now process the previously buffered SETUP request along
|
||
with the OUT data.
|
||
3. When the setup packet is dispatched to the class driver,
|
||
the OUT data must be passed as the final parameter in the
|
||
call.
|
||
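The logic flow above might look roughly like the following. This is a
sketch, not any existing driver: ep0_dispatch() is a stand-in for the
call into the class driver, EP0_MAXPACKET is an assumed value, and only
ctrlreq, ep0data, and EP0STATE_SETUP_OUT are taken from the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EP0_MAXPACKET 64     /* Assumed EP0 max packet size */
#define USB_DIR_IN    0x80   /* bmRequestType direction bit */

enum ep0state
{
  EP0STATE_IDLE = 0,
  EP0STATE_SETUP_OUT         /* Waiting for EP0 setup OUT data */
};

struct ep0_dev
{
  enum ep0state state;
  uint8_t ctrlreq[8];              /* Buffered SETUP request */
  uint8_t ep0data[EP0_MAXPACKET];  /* Buffered EP0 OUT data */
  int     ndispatched;             /* Bookkeeping for this sketch only */
  size_t  lastdatlen;
};

/* Stand-in for dispatching ctrlreq plus OUT data to the class driver */

static void ep0_dispatch(struct ep0_dev *dev, const uint8_t *data,
                         size_t len)
{
  (void)data;
  dev->ndispatched++;
  dev->lastdatlen = len;
}

/* Step 1: a SETUP packet was received */

void ep0_setup(struct ep0_dev *dev, const uint8_t setup[8])
{
  uint16_t wlength;

  memcpy(dev->ctrlreq, setup, 8);
  wlength = (uint16_t)(setup[6] | (setup[7] << 8));

  if ((setup[0] & USB_DIR_IN) == 0 && wlength > 0)
    {
      dev->state = EP0STATE_SETUP_OUT;  /* Wait for the OUT data */
    }
  else
    {
      ep0_dispatch(dev, NULL, 0);       /* No OUT data: process now */
    }
}

/* Step 2: an EP0 OUT DATA packet was received */

bool ep0_out_data(struct ep0_dev *dev, const uint8_t *data, size_t len)
{
  if (dev->state != EP0STATE_SETUP_OUT || len > EP0_MAXPACKET)
    {
      return false;
    }

  memcpy(dev->ep0data, data, len);
  dev->state = EP0STATE_IDLE;

  /* Step 3: pass the buffered SETUP request along with the OUT data */

  ep0_dispatch(dev, dev->ep0data, len);
  return true;
}
```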
|
||
Update 2013-9-2: The new USB device-side driver for the SAMA5D3
|
||
correctly supports OUT SETUP data following the same design as
|
||
per above.
|
||
|
||
Update 2013-11-7: David Sidrane has fixed this issue with the
|
||
STM32 F1 USB device driver. Still a few more to go before this
|
||
can be closed out.
|
||
|
||
Status: Open
|
||
Priority: High for class drivers that need EP0 data. For example, the
|
||
CDC/ACM serial driver might need the line coding data (that
|
||
data is not used currently, but it might be).
|
||
|
||
Title: IMPROVED USAGE of STM32 USB RESOURCES
|
||
Description: The STM32 platforms use a non-standard, USB host peripheral
|
||
that uses "channels" to implement data transfers. The current
|
||
logic associates each channel with a pipe/endpoint (with two
|
||
channels for bi-directional control endpoints). The OTGFS
|
||
peripheral has 8 channels and the OTGHS peripheral has 12
|
||
channels.
|
||
|
||
This works okay until you add a hub and try to connect multiple
|
||
devices. A typical device will require 3-4 pipes and, hence,
|
||
4-5 channels. This effectively prevents using a hub with the
|
||
STM32 devices. This also applies to the EFM32 which uses the
|
||
same IP.
|
||
|
||
It should be possible to redesign the STM32 F4 OTGHS/OTGFS and
|
||
EFM32 host driver so that channels are dynamically assigned to
|
||
pipes as needed for individual transfers. Then you could have
|
||
more "apparent" pipes and make better use of channels.
|
||
Although there are only 8 or 12 channels, transfers are not
|
||
active all of the time on all channels so it ought to be
|
||
possible to have an unlimited number of "pipes" but with no
|
||
more than 8 or 12 active transfers.
|
||
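One way to picture the dynamic assignment is a small channel pool that
is claimed for the duration of a transfer and released on completion.
The sketch below uses hypothetical names and the OTGFS count of 8
channels; it is not taken from the existing host drivers.

```c
#include <stdint.h>

#define NCHANNELS 8   /* OTGFS; the OTGHS peripheral would use 12 */

/* Hypothetical pool: one bit per hardware channel */

struct chan_pool
{
  uint16_t inuse;
};

/* Claim a free channel for one transfer; returns -1 if all are busy,
 * in which case the caller would have to wait for a transfer to finish.
 */

int chan_alloc(struct chan_pool *pool)
{
  int ch;

  for (ch = 0; ch < NCHANNELS; ch++)
    {
      if ((pool->inuse & (1u << ch)) == 0)
        {
          pool->inuse |= (uint16_t)(1u << ch);
          return ch;
        }
    }

  return -1;
}

/* Release the channel when the transfer completes */

void chan_free(struct chan_pool *pool, int ch)
{
  if (ch >= 0 && ch < NCHANNELS)
    {
      pool->inuse &= (uint16_t)~(1u << ch);
    }
}
```

With this arrangement the number of pipes is bounded only by memory,
while the number of simultaneously active transfers is still bounded by
the number of channels.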
Status: Open
|
||
Priority: Medium-Low
|
||
|
||
Title: USB CDC/ACM HOST CLASS DRIVER
|
||
Description: A CDC/ACM host class driver has been added. This has been
|
||
tested by running the USB CDC/ACM host on an Olimex
|
||
LPC1766STK and using the
|
||
boards/arm/stm32/stm3210e-eval/configs/usbserial
|
||
configuration (using the CDC/ACM device side driver). There
|
||
are several unresolved issues that prevent the host driver
|
||
from being usable:
|
||
|
||
- The driver works fine when configured for reduced or bulk-
|
||
only protocol on the Olimex LPC1766STK.
|
||
|
||
- Testing has not been performed with the interrupt IN channel
|
||
enabled (i.e., I have not enabled FLOW control nor do I have
|
||
a test case that used the interrupt IN channel). I can see
|
||
that the polling for interrupt IN data is occurring
|
||
initially.
|
||
|
||
- I test for incoming data by doing 'nsh> cat /dev/ttyACM0' on
|
||
the Olimex LPC1766STK host. The bulk data reception still
|
||
works okay whether or not the interrupt IN channel is enabled.
|
||
If the interrupt IN channel is enabled, then polling of that
|
||
channel appears to stop when the bulk IN channel becomes
|
||
active.
|
||
|
||
- The RX reception logic uses the low priority work queue.
|
||
However, that logic never returns and so blocks other use of
|
||
the work queue thread. This is probably okay but means that
|
||
the RX reception logic probably should be moved to its own
|
||
dedicated thread.
|
||
|
||
- I get crashes when I run with the STM32 OTGHS host driver.
|
||
Apparently the host driver is trashing memory on receipt
|
||
of data.
|
||
|
||
UPDATE: This behavior needs to be retested with:
|
||
commit ce2845c5c3c257d081f624857949a6afd4a4668a
|
||
Author: Janne Rosberg <janne.rosberg@offcode.fi>
|
||
Date: Tue Mar 7 06:58:32 2017 -0600
|
||
|
||
usbhost_cdcacm: fix tx outbuffer overflow and remove now
|
||
invalid assert
|
||
|
||
commit 3331e9c49aaaa6dcc3aefa6a9e2c80422ffedcd3
|
||
Author: Janne Rosberg <janne.rosberg@offcode.fi>
|
||
Date: Tue Mar 7 06:57:06 2017 -0600
|
||
|
||
STM32 OTGHS host: stm32_in_transfer() fails and returns NAK
|
||
if a short transfer is received. This causes problems from
|
||
class drivers like CDC/ACM where short packets are expected.
|
||
In those protocols, any transfer may be terminated by sending
|
||
short or NUL packet.
|
||
|
||
commit 0631c1aafa76dbaa41b4c37e18db98be47b60481
|
||
Author: Gregory Nutt <gnutt@nuttx.org>
|
||
Date: Tue Mar 7 07:17:24 2017 -0600
|
||
|
||
STM32 OTGFS, STM32 L4 and F7: Adapt Janne Rosberg's patch to
|
||
STM32 OTGHS host to OTGFS host, and to similar implements for
|
||
L4 and F7.
|
||
|
||
- The SAMA5D EHCI and the LPC31 EHCI drivers both take semaphores
|
||
in the cancel method. The current CDC/ACM class driver calls
|
||
the cancel() method from an interrupt handler. This will
|
||
cause a crash. Those EHCI drivers should be redesigned to
|
||
permit cancellation from the interrupt level.
|
||
|
||
Most of these problems are unique to the Olimex LPC1766STK
|
||
DCD; some are probably design problems in the CDC/ACM host
|
||
driver. The bottom line is that the host CDC/ACM driver is
|
||
still immature and you could experience issues in some
|
||
configurations if you use it.
|
||
|
||
That all being said, I know of no issues with the current
|
||
CDC/ACM driver on the Olimex LPC1766STK platform if the interrupt
|
||
IN endpoint is not used, i.e., in "reduced" mode. The only loss
|
||
of functionality is output flow control.
|
||
|
||
UPDATE: The CDC/ACM class driver may also now be functional on
|
||
the STM32. That needs to be verified.
|
||
|
||
Status: Open
|
||
Priority: Medium-Low unless you really need host CDC/ACM support.
|
||
|
||
o Libraries (libs/libc/, libs/libm/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: SIGNED time_t
|
||
Description: The NuttX time_t is type uint32_t. I think this is consistent
|
||
with all standards and with normal usage of time_t. However,
|
||
according to Wikipedia, time_t is usually implemented as a
|
||
signed 32-bit value.
|
||
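One practical difference shows up when times are subtracted. A small
illustration, using a hypothetical utime32_t type as a stand-in for an
unsigned 32-bit time_t:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t utime32_t;   /* Stand-in for an unsigned 32-bit time_t */

/* Naive comparison: with unsigned arithmetic the difference wraps, so
 * this is true whenever a != b, even if a is earlier than b.
 */

bool naive_is_later(utime32_t a, utime32_t b)
{
  return (a - b) > 0;
}

/* Reinterpreting the wrapped difference as signed recovers the sign for
 * times less than 2^31 seconds apart.  The conversion of an out-of-range
 * value to int32_t is implementation-defined in C; common compilers
 * wrap in two's complement.
 */

int32_t signed_diff(utime32_t a, utime32_t b)
{
  return (int32_t)(a - b);
}
```

Code written for a signed time_t that tests for a negative difference
would silently misbehave with the unsigned type.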
Status: Open
|
||
Priority: Very low unless there is some compelling issue that I do not
|
||
know about.
|
||
|
||
Title: ENVIRON
|
||
Description: The definition of environ in stdlib.h is bogus and will not
|
||
work as it should. This is because the underlying
|
||
representation of the environment is not an array of pointers.
|
||
Status: Open
|
||
Priority: Medium
|
||
|
||
Title: TERMIOS
|
||
Description: Need some minimal termios support... at a minimum, enough to
|
||
switch between raw and "normal" modes to support behavior like
|
||
that needed for readline().
|
||
UPDATE: There is growing functionality in libs/libc/termios/
|
||
and in the ioctl methods of several MCU serial drivers (stm32,
|
||
lpc43, lpc17, pic32, and others). However, as phrased, this
|
||
bug cannot yet be closed since this "growing functionality"
|
||
does not address all termios.h functionality and not all
|
||
serial drivers support termios.
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: CONCURRENT STREAM READ/WRITE
|
||
Description: NuttX only supports a single file pointer so reads and writes
|
||
must be from the same position. This prohibits implementation
|
||
of behavior like that required for fopen() with the "a+" mode.
|
||
According to the fopen man page:
|
||
|
||
"a+ Open for reading and appending (writing at end of file).
|
||
The file is created if it does not exist. The initial file
|
||
position for reading is at the beginning of the file, but
|
||
output is always appended to the end of the file."
|
||
|
||
At present, the single NuttX file pointer is positioned to the
|
||
end of the file for both reading and writing.
|
||
Status: Open
|
||
Priority: Medium. This kind of operation is probably not very common in
|
||
deeply embedded systems but is required by standards.
|
||
|
||
Title: DIVIDE BY ZERO
|
||
Description: This is bug 3468949 on the SourceForge website (submitted by
|
||
Philipp Klaus Krause):
|
||
"lib_strtod.c does contain divisions by zero in lines 70 and 96.
|
||
AFAIK, unlike for Java, division by zero is not a reliable way to
|
||
get infinity in C. AFAIK compilers are allowed e.g. give a compile-
|
||
time error, and some, such as sdcc, do. AFAIK, C implementations
|
||
are not even required to support infinity. In C99 the macro isinf()
|
||
could replace the first use of division by zero. Unfortunately, the
|
||
macro INFINITY from math.h probably can't replace the second division
|
||
by zero, since it will result in a compile-time diagnostic, if the
|
||
implementation does not support infinity."
|
||
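One possible direction, offered only as a sketch: the C standard
specifies that strtod() returns plus or minus HUGE_VAL on overflow, and
HUGE_VAL is available from math.h whether or not the implementation
supports infinity. The function name below is hypothetical and is not
part of lib_strtod.c.

```c
#include <math.h>

/* Overflow result for a strtod()-like function without dividing by
 * zero.  HUGE_VAL is infinity where the implementation supports it and
 * a large finite value otherwise.
 */

double strtod_overflow_value(int negative)
{
  return negative ? -HUGE_VAL : HUGE_VAL;
}
```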
Status: Open
|
||
Priority:
|
||
|
||
Title: OLD dtoa NEEDS TO BE UPDATED
|
||
Description: This implementation of dtoa in libs/libc/stdio is old and will not
|
||
work with some newer compilers. See
|
||
http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html
|
||
Update: A new dtoa version is now available and enabled with
|
||
CONFIG_NANO_PRINTF. However, the old version of dtoa is still in
|
||
place and lib_libvsprintf() has been duplicated. I think this
|
||
issue should remain open until the implementations have been
|
||
unified.
|
||
Status: Open
|
||
Priority: ??
|
||
|
||
Title: FLOATING POINT FORMATS
|
||
Description: Only the %f floating point format is supported. Others are
|
||
accepted but treated like %f.
|
||
Update: %g is supported with CONFIG_NANO_PRINTF.
|
||
Status: Open
|
||
Priority: Medium (this might be important to someone).
|
||
|
||
Title: LIBM INACCURACIES
|
||
Description: "..if you are writing something like robot control or
|
||
inertial navigation system for aircraft, I have found
|
||
that using the toolchain libmath is only safe option.
|
||
I ported some code for converting quaternions to Euler
|
||
angles to NuttX for my project and only got it working
|
||
after switching to newlib math library.
|
||
|
||
"NuttX does not fully implement IEC 60559 floating point
|
||
from C99 (sections marked [MX] in OpenGroup specs) so if
|
||
your code assumes that some function, say pow(), actually
|
||
behaves right for all the twenty or so odd corner cases
|
||
that the standards committees have recently specified,
|
||
you might get surprises. I'd expect pow(0.0, 1.0) to
|
||
return 0.0 (as zero raised to any positive power is
|
||
well-defined in mathematics) but I get +Inf.
|
||
|
||
"NuttX atan2(-0.0, -1.0) returns +M_PI instead of correct
|
||
-M_PI. If we expect [MX] functionality, then atan2(Inf, Inf)
|
||
should return M_PI/4, instead NuttX gives NaN.
|
||
|
||
"asin(2.0) does not set domain error or return NaN. In fact
|
||
it does not return at all as the loop in it does not
|
||
converge, hanging your app.
|
||
|
||
"There are likely many other issues like these as the Rhombus
|
||
OS code has not been tested or used that much. Sorry for not
|
||
providing patches, but we found it easier just to switch the
|
||
math library."
|
||
|
||
UPDATE: 2015-09-01: A fix for the noted problems with asin()
|
||
has been applied.
|
||
2016-07-30: Numerous fixes and performance improvements from
|
||
David Alessio.
|
||
|
||
Status: Open
|
||
Priority: Low for casual users but clearly high if you care about
|
||
these incorrect corner case behaviors in the math libraries.
|
||
|
||
Title: REPARTITION LIBC FUNCTIONALITY
|
||
Description: There are many things implemented within the kernel (for example
|
||
under sched/pthread) that probably should be migrated into the
|
||
C library where it belongs.
|
||
|
||
I would really like to see a little flavor of a micro-kernel
|
||
at the OS interface: I would like to see more primitive OS
|
||
system calls with more higher level logic in the C library.
|
||
|
||
One awkward thing is the incompatibility of KERNEL vs FLAT
|
||
builds: In the kernel build, it would be nice to move many
|
||
of the thread-specific data items out of the TCB and into
|
||
the process address environment where they belong. It is
|
||
difficult to make this compatible with the FLAT build,
|
||
however.
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
o File system / Generic drivers (fs/, drivers/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
NOTE: The NXFFS file system has its own TODO list at nuttx/fs/nxffs/README.txt
|
||
|
||
Title: MISSING FILE SYSTEM FEATURES
|
||
Description: Implement missing file system features:
|
||
|
||
chmod() is probably not relevant since file modes are not
|
||
currently supported.
|
||
|
||
File privileges would also be good to support. But this is
|
||
really a small part of a much larger feature. NuttX has no
|
||
user IDs, there are no groups, there are no privileges
|
||
associated with either. Users don't need credentials.
|
||
This is really a system-wide issue of which chmod is only
|
||
a small part.
|
||
|
||
User privileges never seemed important to me since NuttX is
|
||
intended for deeply embedded environments where there are
|
||
not multiple users with varying levels of trust.
|
||
|
||
link, unlink, softlink, readlink - For symbolic links. Only
|
||
the ROMFS file system currently supports hard and soft links,
|
||
so this is not too important. The top-level, pseudo-file
|
||
system supports soft links.
|
||
|
||
File locking
|
||
|
||
Special files - NuttX supports special files only in the top-
|
||
level pseudo file system. Unix systems support many
|
||
different special files via mknod(). This would be
|
||
important only if it is an objective of NuttX to become a
|
||
true Unix OS. Again only supported by ROMFS.
|
||
|
||
True inodes - Standard Unix inodes. Currently only supported
|
||
by ROMFS.
|
||
|
||
File times, for example as set by utimes().
|
||
|
||
The primary obstacle to all these is that each would require
|
||
changes to all existing file systems. That number is pretty
|
||
large. The number of file system implementations that would
|
||
need to be reviewed and modified is significant. As of this writing, this
|
||
would include binfs, fat, hostfs, nfs, nxffs, procfs, romfs,
|
||
tmpfs, unionfs, plus pseudo-file system support.
|
||
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: ROMFS CHECKSUMS
|
||
Description: The ROMFS file system does not verify checksums on either
|
||
volume header or on the individual files.
|
||
Status: Open
|
||
Priority: Low. I have mixed feelings about whether NuttX should pay a
|
||
performance penalty for better data integrity.
|
||
|
||
Title: SPI-BASED SD MULTIPLE BLOCK TRANSFERS
|
||
Description: The simple SPI-based MMC/SD driver in drivers/mmcsd does not
|
||
yet handle multiple block transfers.
|
||
Status: Open
|
||
Priority: Medium-Low
|
||
|
||
Title: SDIO-BASED SD READ-AHEAD/WRITE BUFFERING INCOMPLETE
|
||
Description: The drivers/mmcsd/mmcsd_sdio.c driver has hooks in place to
|
||
support read-ahead buffering and write buffering, but the logic
|
||
is incomplete and untested.
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: POLLHUP SUPPORT
|
||
Description: All drivers that support the poll method should also report
|
||
the POLLHUP event when the driver is closed.
|
||
Status: Open
|
||
Priority: Medium-Low
|
||
|
||
Title: UNIFIED DESCRIPTOR REPRESENTATION
|
||
Description: There are two separate ranges of descriptors for file and
|
||
socket descriptors: if a descriptor is in one range then it is
|
||
recognized as a file descriptor; if it is in another range
|
||
then it is recognized as a socket descriptor. These separate
|
||
descriptor ranges can cause problems, for example, they make
|
||
dup'ing descriptors with dup2() problematic. The two groups
|
||
of descriptors are really indices into two separate tables:
|
||
One is an array of file structures and the other an array of
|
||
socket structures. There really should be one array that
|
||
is a union of file and socket descriptors. Then socket and
|
||
file descriptors could lie in the same range.
|
||
|
||
Another example of how the current implementation limits
|
||
functionality: I recently started to implement FILEMAX
|
||
(using pctl() instead of sysctl()). My objective was to be able
|
||
to control the number of available file descriptors on a task-
|
||
by-task basis. The complexity due to the partitioning of
|
||
descriptor space into a range for file descriptors and a range
|
||
for socket descriptors made this feature nearly impossible to
|
||
implement.
|
||
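A rough sketch of the single-table idea follows. All names are
hypothetical, the file and socket structures are reduced to integer
indices, and the table size is arbitrary; it only shows that one tagged
table lets both kinds of descriptor share one range and makes dup2()
uniform.

```c
#define NDESCRIPTORS 8   /* Arbitrary size for this sketch */

enum desc_type
{
  DESC_FREE = 0,
  DESC_FILE,
  DESC_SOCKET
};

/* One entry describes either a file or a socket */

struct desc_entry
{
  enum desc_type type;
  union
  {
    int file_index;   /* Stand-in for a file structure */
    int sock_index;   /* Stand-in for a socket structure */
  } u;
};

struct desc_table
{
  struct desc_entry entry[NDESCRIPTORS];
};

/* Allocate the lowest free descriptor, whatever its type */

int desc_alloc(struct desc_table *tbl, enum desc_type type, int index)
{
  int fd;

  for (fd = 0; fd < NDESCRIPTORS; fd++)
    {
      if (tbl->entry[fd].type == DESC_FREE)
        {
          tbl->entry[fd].type = type;
          if (type == DESC_FILE)
            {
              tbl->entry[fd].u.file_index = index;
            }
          else
            {
              tbl->entry[fd].u.sock_index = index;
            }

          return fd;
        }
    }

  return -1;
}

/* dup2() no longer cares which kind of descriptor it is copying */

int desc_dup2(struct desc_table *tbl, int oldfd, int newfd)
{
  if (oldfd < 0 || oldfd >= NDESCRIPTORS ||
      newfd < 0 || newfd >= NDESCRIPTORS ||
      tbl->entry[oldfd].type == DESC_FREE)
    {
      return -1;
    }

  tbl->entry[newfd] = tbl->entry[oldfd];
  return newfd;
}
```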
Status: Open
|
||
Priority: Low
|
||
|
||
Title: DUPLICATE FAT FILE NAMES
|
||
Description: "The NSH and POSIX API interpretations about sensitivity or
|
||
insensitivity to upper/lowercase file names seem to be not
|
||
consistent in our usage - which can result in creating two
|
||
directories with the same name..."
|
||
|
||
Example using NSH:
|
||
|
||
nsh> echo "Test1" >/tmp/AtEsT.tXt
|
||
nsh> echo "Test2" >/tmp/aTeSt.TxT
|
||
nsh> ls /tmp
|
||
/tmp:
|
||
AtEsT.tXt
|
||
aTeSt.TxT
|
||
nsh> cat /tmp/aTeSt.TxT
|
||
Test2
|
||
nsh> cat /tmp/AtEsT.tXt
|
||
Test1
|
||
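FAT file names are conventionally matched without regard to case, so the
two names above should have referred to the same file. A minimal sketch
of such a comparison, using a hypothetical function name and assuming
plain ASCII names:

```c
#include <ctype.h>
#include <stdbool.h>

/* Compare two file names the way a case-insensitive, case-preserving
 * file system would.  ASCII only; long file names in other encodings
 * would need more care.
 */

bool fat_name_equal(const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
        {
          return false;
        }

      a++;
      b++;
    }

  /* Equal only if both strings ended together */

  return *a == *b;
}
```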
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: MISSING FILES IN NSH 'LS' OF A DIRECTORY
|
||
Description: I have seen cases where (1) long file names are enabled,
|
||
but (2) a short file name is created like:
|
||
|
||
nsh> echo "This is another test" >/mnt/sdcard/another.txt
|
||
|
||
But then on subsequent 'ls' operations, the file does not appear:
|
||
|
||
nsh> ls -l /mnt/sdcard
|
||
|
||
I have determined that the problem is because, for some as-
|
||
of-yet-unknown reason the short file name is treated as a long
|
||
file name. The name then fails the long filename checksum
|
||
test and is skipped.
|
||
|
||
readdir() (and fat_readdir()) is the logic underlying the
|
||
failure and the problem appears to be something unique to the
|
||
fat_readdir() implementation. Why? Because the file is
|
||
visible when you put the SD card on a PC and because this
|
||
works fine:
|
||
|
||
nsh> ls -l /mnt/sdcard/another.txt
|
||
|
||
The failure does not happen on all short file names. I do
|
||
not understand the pattern. But I have not had the opportunity
|
||
to dig into this deeply.
|
||
Status: Open
|
||
Priority: Perhaps not a problem??? I have analyzed this problem and
|
||
I am not sure what to do about it. I suspect that a
|
||
fat filesystem was used with a version of NuttX that does
|
||
not support long file name entries. Here is the failure
|
||
scenario:
|
||
|
||
1) A file with a long file name is created under Windows.
|
||
2) Then the file is deleted. I am not sure if Windows or
|
||
NuttX deleted the file, but the resulting directory
|
||
content is not compatible with NuttX with long file
|
||
name support.
|
||
|
||
The file deletion left the full sequence of long
|
||
file name entries intact but apparently deleted only
|
||
the following short file name entry. I am thinking
|
||
that this might have happened because a version of NuttX
|
||
with only short file name support was used to delete
|
||
the file.
|
||
|
||
3) When a new file with a short file name was created, it
|
||
re-used the short file name entry that was previously
|
||
deleted. This makes the new short file name entry
|
||
look like a part of the long file name.
|
||
|
||
4) When comparing the checksum in the long file name
|
||
entry with the checksum of the short file name, the
|
||
checksum fails and the entire directory sequence is
|
||
ignored by readdir() logic. This is why the file does
|
||
not appear in the 'ls'.
|
||
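For reference, the long file name checksum is computed over the 11-byte
8.3 short name by rotating an 8-bit sum right and adding each byte. The
sketch below uses hypothetical function names and is not the NuttX FAT
code; it shows why a stale long name sequence stops matching once its
short entry is reused for a different name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Checksum of the 11-byte short name (8 name + 3 extension bytes):
 * rotate the running sum right by one bit, then add the next byte.
 */

uint8_t fat_lfn_checksum(const uint8_t shortname[11])
{
  uint8_t sum = 0;
  int i;

  for (i = 0; i < 11; i++)
    {
      sum = (uint8_t)(((sum & 1) << 7) + (sum >> 1) + shortname[i]);
    }

  return sum;
}

/* A long name sequence belongs to a short entry only if the checksum
 * stored in the long name entries matches the short name.
 */

bool fat_lfn_belongs_to(uint8_t lfn_checksum, const uint8_t shortname[11])
{
  return lfn_checksum == fat_lfn_checksum(shortname);
}
```

A possible remedy, not verified here, would be for readdir() to discard
only the mismatched long name entries and still report the short entry.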
|
||
Title: SILENT SPIFFS FILE TRUNCATION
|
||
Description: Under certain corner case conditions, SPIFFS will truncate
|
||
files. All of the writes to the file will claim that the
|
||
data has been written but after the file is closed, it may
|
||
be a little shorter than expected.
|
||
|
||
This is due to how the caching is implemented in SPIFFS:
|
||
|
||
1. On each write, the data is not written to the FLASH but
|
||
rather to an internal cache in memory.
|
||
2. When a write causes the cache to become full, the
|
||
content of the cache is flushed to FLASH. If that flush
|
||
fails because the FLASH has become full, write will
|
||
return the file system full error (ENOSPC).
|
||
3. The cache is also flushed when the file is closed (or
|
||
when fsync() is called). These will also fail if the
|
||
file system becomes full.
|
||
|
||
The problem is when the file is closed, the final file
|
||
size could be smaller than the number of successful writes
|
||
to the file.
|
||
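The effect can be reproduced with a toy model of a write-back cache.
This is not SPIFFS code: the names, the 256-byte cache, and the
assumption that a single write never exceeds the cache size are all
made up for illustration.

```c
#include <stddef.h>

#define CACHE_SIZE 256   /* Assumed cache size for this sketch */

/* Toy model: byte counts only, no actual data */

struct sim_file
{
  size_t flash_free;   /* Bytes still available on FLASH */
  size_t flash_used;   /* Bytes of this file actually on FLASH */
  size_t cached;       /* Bytes held in the cache */
  size_t accepted;     /* Bytes for which write() reported success */
};

/* Flush the cache to FLASH; fails if FLASH cannot hold it */

int sim_flush(struct sim_file *f)
{
  if (f->cached > f->flash_free)
    {
      return -1;   /* Would be ENOSPC */
    }

  f->flash_free -= f->cached;
  f->flash_used += f->cached;
  f->cached      = 0;
  return 0;
}

/* Write len bytes (len <= CACHE_SIZE assumed).  Only a flush forced by
 * a full cache can fail here; data that fits in the cache is accepted
 * without checking whether FLASH has room for it.
 */

long sim_write(struct sim_file *f, size_t len)
{
  if (f->cached + len > CACHE_SIZE && sim_flush(f) < 0)
    {
      return -1;
    }

  f->cached   += len;
  f->accepted += len;
  return (long)len;
}

/* Close flushes whatever is left; by then the writes already succeeded */

int sim_close(struct sim_file *f)
{
  return sim_flush(f);
}
```

With 300 bytes free, two 200-byte writes both succeed, but the final
flush at close fails, leaving 200 bytes on FLASH after 400 bytes of
successful writes.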
|
||
This error is probably not so significant in a real world
|
||
file system usage: It requires that you write continuously
|
||
to SPIFFS, never deleting files or freeing FLASH resources
|
||
in any way. And it requires the unlikely circumstance that
|
||
the final file written has its last few hundred bytes in
|
||
cache when the file is closed but there are even fewer bytes
|
||
available on the FLASH. That would be rare with a cache
|
||
size of a few hundred bytes and very large serial FLASH.
|
||
|
||
This issue does cause the test at apps/testing/fstest to
|
||
fail. That test fails with a "Partial Read" because the
|
||
file being read is smaller than the number of bytes written to the
|
||
file. That test does write small files continuously until
|
||
the file system is full and even then the error is rare. The
|
||
boards/sim/sim/sim/configs/spiffs test can be used to
|
||
demonstrate the error.
|
||
Status: Open
|
||
Priority: Medium. It is certainly a file system failure, but I think that
|
||
the exposure in real-world use cases is very small.
|
||
|
||
Title: FAT: CAN'T SEEK TO END OF FILE IF READ-ONLY
|
||
Description: If the size of the underlying file is an exact multiple of the
|
||
FAT cluster size, then you cannot seek to the end of the file
|
||
if the file was opened read-only. In that case, the FAT lseek
|
||
logic will return ENOSPC.
|
||
|
||
This is because seeking to the end of the file involves seeking
|
||
to an offset that is the size of the file (number of bytes
|
||
allocated for the file, i.e., the last valid offset + 1). In order to seek to a position, the
|
||
current FAT implementation insists that there be allocated file
|
||
space at the seek position. Seeking beyond the end of the file
|
||
has the side effect of extending the file.
|
||
|
||
[NOTE: This automatic extension of the file cluster allocation
|
||
is probably unnecessary and another issue of its own.]
|
||
|
||
For example, suppose you have a cluster size that is 4096 bytes
|
||
and a file that is 8192 bytes long. Then the file will consist
|
||
of 2 allocated clusters at offsets 0 through 8191.
|
||
|
||
If the file is opened O_RDWR or O_WRONLY, then the statement:
|
||
|
||
offset = lseek(fd, 0, SEEK_END);
|
||
|
||
will seek to offset 8192, which is beyond the end of the file, so a
|
||
new (empty) cluster will be added. Now the file consists of
|
||
three clusters and the file position refers to the first byte of
|
||
the third cluster.
|
||
|
||
If the file is opened O_RDONLY, however, then that same lseek
|
||
statement will fail. It is not possible to seek to position
|
||
8192. That is beyond the end of the allocated cluster chain
|
||
and since the file is read-only, it is not permitted to extend
|
||
the cluster chain. Hence, the error ENOSPC is returned.
|
||
|
||
This code snippet will duplicate the problem. It assumes a
|
||
cluster size of 512 and that /tmp is a mounted FAT file system:
|
||
|
||
#include <nuttx/config.h>

#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BUFSIZE 1024 /* Or 8192, depending on the cluster size */

static char buffer[BUFSIZE];

#if defined(BUILD_MODULE)
int main(int argc, FAR char *argv[])
#else
int hello_main(int argc, char *argv[])
#endif
{
  ssize_t nwritten;
  off_t pos;
  int fd;
  int ch;
  int i;

  /* Fill the buffer with a repeating pattern of printable characters */

  for (i = 0, ch = ' '; i < BUFSIZE; i++)
    {
      buffer[i] = ch;

      if (++ch == 0x7f)
        {
          ch = ' ';
        }
    }

  /* Create a file whose size is an exact multiple of the cluster size */

  fd = open("/tmp/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    {
      printf("open failed: %d\n", errno);
      return 1;
    }

  nwritten = write(fd, buffer, BUFSIZE);
  if (nwritten < 0)
    {
      printf("write failed: %d\n", errno);
      return 1;
    }

  close(fd);

  /* Re-open the file read-only and try to seek to its end */

  fd = open("/tmp/testfile", O_RDONLY);
  if (fd < 0)
    {
      printf("open failed: %d\n", errno);
      return 1;
    }

  pos = lseek(fd, 0, SEEK_END);
  if (pos < 0)
    {
      printf("lseek failed: %d\n", errno);
      return 1;
    }
  else if (pos != BUFSIZE)
    {
      printf("lseek failed: %ld\n", (long)pos);
      return 1;
    }

  close(fd);
  return 0;
}
|
||
|
||
Status: Open
|
||
Priority: Medium. Although this is a significant design error, the problem
|
||
has existed for 11 years without being previously reported. I
|
||
conclude, then, that the exposure from this problem is not great.
|
||
|
||
Why would you seek to the end of a file using a read-only file
|
||
descriptor anyway? Only one reason I can think of: To get the
|
||
size of the file. The alternative (and much more efficient) way
|
||
to do that is via stat().
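
As a minimal sketch of that alternative (the helper name file_size()
is hypothetical), the size can be obtained without opening or seeking
the file at all:

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return the size of the file at path in bytes,
 * or -1 on error (errno is set by stat()).
 */

off_t file_size(const char *path)
{
  struct stat st;

  if (stat(path, &st) < 0)
    {
      return (off_t)-1;
    }

  return st.st_size;
}
```

For the file created by the snippet above, file_size("/tmp/testfile")
would be expected to return BUFSIZE, with no cluster chain extension
involved.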
|
||
|
||
o Graphics Subsystem (graphics/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
See also the NxWidgets TODO list file for related issues.
|
||
|
||
Title: UNTESTED GRAPHICS APIS
|
||
Description: Testing of all APIs is not complete. See
|
||
http://nuttx.sourceforge.net/NXGraphicsSubsystem.html#testcoverage
|
||
Status: Open
|
||
Priority: Medium
|
||
|
||
Title: ITALIC FONTS / NEGATIVE FONT OFFSETS
|
||
Description: Font metric structure (in include/nuttx/nx/nxfont.h) should allow
|
||
negative X offsets. Negative x-offsets are necessary for certain
|
||
glyphs (and are very common in italic fonts).
|
||
For example Eth, icircumflex, idieresis, and oslash should have
|
||
offset=1 in the 40x49b font (these missing negative offsets are
|
||
NOTE'ed in the font header files).
|
||
Status: Open. The problem is that the x-offset is an unsigned bitfield
|
||
in the current structure.
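
The effect of an unsigned bitfield can be seen in isolation. The two
structures below are hypothetical and only illustrate the point; they
are not the layout used in nxfont.h, and the 6-bit width is an
assumption:

```c
/* Hypothetical metric with an unsigned x-offset field */

struct metric_unsigned_s
{
  unsigned int xoffset : 6;   /* Range 0..63, no negative values */
};

/* Hypothetical metric with a signed x-offset field */

struct metric_signed_s
{
  signed int xoffset : 6;     /* Range -32..31 */
};

/* Store value in the unsigned field and read it back */

int unsigned_xoffset(int value)
{
  struct metric_unsigned_s m;

  m.xoffset = (unsigned int)value;  /* -1 wraps modulo 64 */
  return (int)m.xoffset;
}

/* Store value in the signed field and read it back */

int signed_xoffset(int value)
{
  struct metric_signed_s m;

  m.xoffset = value;                /* -1 is preserved */
  return m.xoffset;
}
```

With these definitions, an offset of -1 reads back as 63 from the
unsigned field but as -1 from the signed field.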
|
||
Priority: Low.
|
||
|
||
Title: RAW WINDOW AUTORAISE
|
||
Description: Auto-raise only applies to NXTK windows. Shouldn't it also apply
|
||
to raw windows as well?
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: AUTO-RAISE DISABLED
|
||
Description: Auto-raise is currently disabled. The reason is complex:
|
||
- Most touchscreen controls send touch data at high rates
|
||
- In multi-server mode, touch events get queued in a message
|
||
queue.
|
||
- The logic that receives the messages performs the auto-raise.
|
||
But it can do stupid things after the first auto-raise as
|
||
it operates on the stale data in the message queue.
|
||
I am thinking that auto-raise ought to be removed from NuttX
|
||
and moved out into a graphics layer (like NxWM) that knows
|
||
more about the appropriate context to do the autoraise.
|
||
Status: Open
|
||
Priority: Medium low
|
||
|
||
Title: NxTERM VT100 SUPPORT
|
||
Description: If the NxTerm will be used with the Emacs-like command line
|
||
editor (CLE), then it will need to support VT100 cursor control
|
||
commands.
|
||
Status: Open
|
||
Priority: Low, the need has not yet arisen.
|
||
|
||
Title: VERTICAL ANTI-ALIASING
|
||
Description: Anti-aliasing is implemented along the horizontal raster line
|
||
with fractional pixels at the ends of each line. There is no
|
||
accounting for fractional pixels in the vertical direction.
|
||
As a result lines closer to vertical receive better anti-
|
||
aliasing than lines closer to horizontal.
|
||
Status: Open
|
||
Priority: Low, not a serious issue but worth noting. There is no plan
|
||
to change this behavior.
|
||
|
||
Title: WIDE-FONT SUPPORT
|
||
Description: Wide fonts are not currently supported by the NuttX graphics sub-
|
||
system.
|
||
Status: Open
|
||
Priority: Low for many, but I imagine higher in countries that use wide fonts.
|
||
|
||
Title: LOW-RES FRAMEBUFFER RENDERING
|
||
Description: There are obvious issues in the low-res, < 8 BPP, implementation of
|
||
the framebuffer rendering logic of graphics/nxglib/fb. I see two
|
||
obvious problems in reviewing nxglib_copyrectangle():
|
||
|
||
1. The masking logic might work for 1 BPP, but is insufficient for other
|
||
resolutions like 2-BPP and 4-BPP.
|
||
2. The use of lnlen will not handle multiple bits per pixel. It
|
||
would need to be converted to a byte count.
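
The two points above can be sketched as follows. This is not code from
nxglib; the function names are hypothetical and the sketch assumes
packed pixels with the most significant bits holding the leftmost
pixel, and bpp equal to 1, 2, or 4:

```c
#include <stdint.h>

/* Number of bytes touched by a run of npixels (> 0) pixels that starts
 * at pixel column x, for a depth of bpp bits per pixel.
 */

unsigned int run_bytes(unsigned int x, unsigned int npixels,
                       unsigned int bpp)
{
  unsigned int firstbit = x * bpp;
  unsigned int lastbit  = (x + npixels) * bpp - 1;  /* Inclusive */

  return lastbit / 8 - firstbit / 8 + 1;
}

/* Bits of the first byte that belong to a run starting at column x */

uint8_t lead_mask(unsigned int x, unsigned int bpp)
{
  unsigned int skip = (x * bpp) % 8;  /* Bits owned by earlier pixels */

  return (uint8_t)(0xff >> skip);
}

/* Bits of the last byte that belong to a run of npixels from column x */

uint8_t tail_mask(unsigned int x, unsigned int npixels, unsigned int bpp)
{
  unsigned int used = ((x + npixels) * bpp) % 8;  /* Bits used in byte */

  return used == 0 ? 0xff : (uint8_t)(0xff << (8 - used));
}
```

For example, a run of 4 pixels starting at column 1 in 2-BPP mode
covers bits 2 through 9, so it touches 2 bytes with a leading mask of
0x3f and a trailing mask of 0xc0.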
|
||
|
||
The function PDC_copy_glyph() in the file apps/graphics/pdcurs34/nuttx/pdcdisp.c
|
||
derives from nxglib_copyrectangle() and all of those issues have been
|
||
resolved in that file.
|
||
|
||
Other framebuffer rendering functions probably have similar issues.
|
||
Status: Open
|
||
Priority: Low. It is not surprising that there would be bugs in this logic:
|
||
I have never encountered a hardware framebuffer with sub-byte pixel
|
||
depth. If such a beast ever shows up, then this priority would be
|
||
higher.
|
||
|
||
Title: INCOMPLETE PLANAR COLOR SUPPORT
|
||
Description: The original NX design included support for planar colors,
|
||
i.e., for devices that provide separate framebuffers for each
|
||
color component. Planar graphics hardware was common some years
|
||
back but is rarely encountered today. In fact, I am not aware
|
||
of any MCU that implements planar framebuffers.
|
||
|
||
Support for planar colors is, however, unverified and
|
||
incomplete. In fact, many recent changes explicitly assume a
|
||
single color plane: Planar colors are specified by an array
|
||
of components; some recent logic uses only component [0],
|
||
ignoring the possible existence of other color component frames.
|
||
|
||
Completely removing planar color support is one reasonable
|
||
option; it is not likely that NuttX will encounter planar
|
||
color hardware and this would greatly simplify the logic and
|
||
eliminate inconsistencies in the implementation.
|
||
Status: Open
|
||
Priority: Low. There is no problem other than one of aesthetics.
|
||
|
||
o Build system
|
||
^^^^^^^^^^^^
|
||
|
||
Title: MAKE EXPORT LIMITATIONS
|
||
Description: The top-level Makefile 'export' target will bundle up all of the
|
||
NuttX libraries, header files, and the startup object into an export-able
|
||
tarball. This target uses the tools/mkexport.sh script. Issues:
|
||
|
||
1. This script assumes the host archiver ar, which may not be appropriate for
|
||
non-GCC toolchains
|
||
2. For the kernel build, the user libraries should be built into some
|
||
libuser.a. The list of user libraries would have to be accepted with
|
||
some new argument, perhaps -u.
|
||
Status: Open
|
||
Priority: Low.
|
||
|
||
o Other drivers (drivers/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: SYSLOG OUTPUT LOST ON A CRASH
|
||
Description: Flush syslog output on crash. I don't know how to do this in the
|
||
character driver case with interrupts disabled. It would be
|
||
easy to flush the interrupt buffer, but not the
|
||
data buffered within a character driver (such as the serial
|
||
driver).
|
||
|
||
Perhaps there could be a crash dump IOCTL command to flush
|
||
that buffered data with interrupts disabled?
|
||
Status: Open
|
||
Priority: Low. It would be a convenience and would simplify crash
|
||
debug if you could see all of the SYSLOG output up to the
|
||
time of the crash. But not essential.
|
||
|
||
Title: SERIAL DRIVER WITH DMA DOES NOT DISCARD OOB CHARACTERS
|
||
Description: If Ctrl-Z or Ctrl-C actions are enabled, then the OOB
|
||
character that generates the signal action must not be placed
|
||
in the serial driver Rx buffer. This behavior is correct for
|
||
the non-DMA case (serial_io.c), but not for the DMA case
|
||
(serial_dma.c). In the DMA case, the OOB character is left
|
||
in the Rx buffer and will be received as normal Rx data by
|
||
the application. It should not work that way.
|
||
|
||
Perhaps in the DMA case, the OOB characters could be filtered
|
||
out later, just before returning the Rx data to the application?
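
A minimal sketch of such a late filter follows. The function name is
hypothetical, this is not code from serial_dma.c, and the sketch
assumes the usual ASCII values for Ctrl-C and Ctrl-Z; a real driver
would filter only the characters whose signal actions are enabled:

```c
#include <stddef.h>

#define OOB_CTRL_C 0x03  /* ASCII ETX, assumed Ctrl-C character */
#define OOB_CTRL_Z 0x1a  /* ASCII SUB, assumed Ctrl-Z character */

/* Hypothetical helper: remove OOB characters from buf in place and
 * return the number of bytes that remain.
 */

size_t rx_strip_oob(char *buf, size_t len)
{
  size_t in;
  size_t out = 0;

  for (in = 0; in < len; in++)
    {
      if (buf[in] != OOB_CTRL_C && buf[in] != OOB_CTRL_Z)
        {
          buf[out++] = buf[in];   /* Keep ordinary Rx data */
        }
    }

  return out;
}
```

The returned count, not the original length, is what would be
reported back to the application as the number of bytes read.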
|
||
Status: Open
|
||
Priority: Low, provided that the application can handle these characters
|
||
in the data stream.
|
||
|
||
o Linux/Cygwin simulation (arch/sim)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: SIMULATOR HAS NO INTERRUPTS (NON-PREEMPTIBLE)
|
||
Description: The current simulator implementation has no interrupts and, hence,
|
||
is non-preemptible. Also, without simulated interrupts, there can
|
||
be no high-fidelity simulated device drivers.
|
||
|
||
Currently, all timing and serial input is simulated in the IDLE loop:
|
||
When nothing is going on in the simulation, the IDLE loop runs and
|
||
fakes timer and UART events.
|
||
Status: Open
|
||
Priority: Low, unless there is a need for developing a higher fidelity simulation.
|
||
I have been thinking about how to implement simulated interrupts in
|
||
the simulation. I think a solution would work like this:
|
||
https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Simulation
|
||
|
||
Title: ROUND-ROBIN SCHEDULING IN THE SIMULATOR
|
||
Description: Since the simulation is not pre-emptible, you can't use round-robin
|
||
scheduling (no time slicing). Currently, the timer interrupts are
|
||
"faked" during IDLE loop processing and, as a result, there is no
|
||
task pre-emption because there are no asynchronous events. This could
|
||
probably be fixed if the "timer interrupt" were driven by Linux
|
||
signals. NOTE: You would also have to implement up_irq_save() and
|
||
up_irq_restore() to block and (conditionally) unblock the signal.
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
o ARM (arch/arm/)
|
||
^^^^^^^^^^^^^^^
|
||
|
||
Title: IMPROVED ARM INTERRUPT HANDLING
|
||
Description: ARM interrupt handling performance could be improved in some
|
||
ways. One easy way is to use a pointer to the context save
|
||
area in g_current_regs instead of using up_copystate so much.
|
||
|
||
This approach is already implemented for the ARM Cortex-M0,
|
||
Cortex-M3, Cortex-M4, and Cortex-A5 families. But still needs
|
||
to be back-ported to the ARM7 and ARM9 (which are nearly
|
||
identical to the Cortex-A5 in this regard). The change is
|
||
*very* simple for this architecture, but not implemented.
|
||
Status: Open. But complete on all ARM platforms except ARM7 and ARM9.
|
||
Priority: Low.
|
||
|
||
Title: IMPROVED ARM INTERRUPT HANDLING
|
||
Description: The ARM and Cortex-M3 interrupt handlers restore all registers
|
||
upon return. This could be improved as well: If there is no
|
||
context switch, then the static registers need not be restored
|
||
because they will not be modified by the called C code.
|
||
(see arch/renesas/src/sh1/sh1_vector.S for example)
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: CORTEX-M3 STACK OVERFLOW
|
||
Description: There is a bit of logic in up_fullcontextrestore() that executes on
|
||
return from interrupts (and other context switches) that looks like:
|
||
|
||
ldr r1, [r0, #(4*REG_CPSR)] /* Fetch the stored CPSR value */
|
||
msr cpsr, r1 /* Set the CPSR */
|
||
|
||
/* Now recover r0 and r1 */
|
||
|
||
ldr r0, [sp]
|
||
ldr r1, [sp, #4]
|
||
add sp, sp, #(2*4)
|
||
|
||
/* Then return to the address at the top of the stack,
|
||
* destroying the stack frame
|
||
*/
|
||
|
||
ldr pc, [sp], #4
|
||
|
||
Under conditions of excessively high interrupt rates, many
|
||
nested interrupts can occur just after the 'msr cpsr' instruction.
|
||
At that time, there are 4 bytes on the stack and, with each
|
||
interrupt, the stack pointer may increment and possibly overflow.
|
||
|
||
This can happen only under conditions of continuous interrupts.
|
||
One suggested change is:
|
||
|
||
ldr r1, [r0, #(4*REG_CPSR)] /* Fetch the stored CPSR value */
|
||
msr spsr_cxsf, r1 /* Set the SPSR */
|
||
ldmia r0, {r0-r15}^
|
||
|
||
But this has not been proven to be a solution.
|
||
|
||
UPDATE: Other ARM architectures have a similar issue.
|
||
|
||
Status: Open
|
||
Priority: Low. The condition of continuous interrupts is really the problem.
|
||
If your design needs continuous interrupts like this, please try
|
||
the above change and, please, submit a patch with the working fix.
|
||
|
||
Title: IMPROVED TASK START-UP AND SYSCALL RETURN
|
||
Description: Couldn't up_start_task and up_start_pthread syscalls be
|
||
eliminated? Wouldn't this work to get us from kernel-
|
||
to user-mode without a system trap:
|
||
|
||
lda r13, #address
|
||
str rn, [r13]
|
||
msr spsr_SVC, rm
|
||
ld r13,{r15}^
|
||
|
||
Would also need to set r13_USER and r14_USER. For new
|
||
SYS_context_switch... couldn't we do the same thing?
|
||
|
||
Also... System calls use traps to get from user- to kernel-
|
||
mode to perform OS services. That is necessary to get from
|
||
user- to kernel-mode. But then another trap is used to get
|
||
from kernel- back to user-mode. It seems like this second
|
||
trap should be unnecessary. We should be able to do the
|
||
same kind of logic to do this.
|
||
Status: Open
|
||
Priority: Low-ish, but a good opportunity for performance improvement.
|
||
|
||
Title: USE COMMON VECTOR LOGIC IN ALL ARM ARCHITECTURES.
|
||
Description: Originally, each ARMv7-M MCU architecture had its own
|
||
private implementation for interrupt vectors and interrupt
|
||
handling logic. This was superseded by common interrupt
|
||
vector logic but these private implementations were never
|
||
removed from older MCU architectures. This is turning into
|
||
a maintenance issue because any improvements to the common
|
||
vector handling must also be re-implemented for each of the
|
||
older MCU architectures.
|
||
Status: Open
|
||
Priority: Low. A pain in the ass and an annoying implementation, but
|
||
not really an issue otherwise.
|
||
|
||
o Network Utilities (apps/netutils/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: UNVERIFIED THTTPD FEATURES
|
||
Description: Not all THTTPD features/options have been verified. In
|
||
particular, there is no test case of a CGI program receiving
|
||
POST input. Only the configuration of apps/examples/thttpd
|
||
has been tested.
|
||
Status: Open
|
||
Priority: Medium
|
||
|
||
o NuttShell (NSH) (apps/nshlib)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
See some NSH issues under "Kernel/Protected Build" as well.
|
||
|
||
Title: IFCONFIG AND MULTIPLE NETWORK INTERFACES
|
||
Description: The ifconfig command will not behave correctly if an interface
|
||
is provided and there are multiple interfaces. It should only
|
||
show status for the single interface on the command line; it will
|
||
still show status for all interfaces.
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
o System libraries apps/system (apps/system)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: READLINE IMPLEMENTATION
|
||
Description: readline implementation does not use C-buffered I/O, but rather
|
||
talks to the serial driver directly via read(). It includes VT-100
|
||
specific editing commands. A more generic readline() should be
|
||
implemented using termios' tcsetattr() to put the serial driver
|
||
into a "raw" mode.
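
A minimal sketch of the termios part is shown below. The helper names
are hypothetical, and only canonical mode and echo are switched off
here, which is the least a line editor needs; a complete
implementation may want to clear further flags:

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: put the terminal on fd into a non-canonical,
 * no-echo mode.  The previous settings are returned in saved.
 * Returns 0 on success, -1 on failure (for example, fd is not a tty).
 */

int tty_set_raw(int fd, struct termios *saved)
{
  struct termios raw;

  if (tcgetattr(fd, saved) < 0)
    {
      return -1;
    }

  raw = *saved;
  raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO); /* No line editing or echo */
  raw.c_cc[VMIN]  = 1;                       /* read() returns per byte */
  raw.c_cc[VTIME] = 0;                       /* No inter-byte timeout */

  return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: restore the settings saved by tty_set_raw() */

int tty_restore(int fd, const struct termios *saved)
{
  return tcsetattr(fd, TCSANOW, saved);
}
```

A readline() built this way would call tty_set_raw() on entry, do its
own editing on the bytes returned by read(), and call tty_restore()
before returning the completed line.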
|
||
Status: Open
|
||
Priority: Low (unless you are using mixed C-buffered I/O with readline and
|
||
fgetc, for example).
|
||
|
||
Title: apps/system PARTITIONING
|
||
Description: Several of the USB device helper applications in apps/system
|
||
violate OS/application partitioning and will fail on a kernel
|
||
or protected build. Many of these have been fixed by adding
|
||
the BOARDIOC_USBDEV_CONTROL boardctl() command. But there are
|
||
still issues.
|
||
|
||
These functions still call directly into operating system
|
||
functions:
|
||
|
||
- usbmsc_configure - Called from apps/system/usbmsc and
|
||
apps/system/composite
|
||
- usbmsc_bindlun - Called from apps/system/usbmsc
|
||
- usbmsc_exportluns - Called from apps/system/usbmsc.
|
||
|
||
Status: Open
|
||
Priority: Medium/High -- the kernel build configuration is not fully fielded
|
||
yet.
|
||
|
||
o Modbus (apps/modbus)
|
||
^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: MODBUS NOT USABLE WITH USB SERIAL
|
||
Description: Modbus can be used with USB serial; however, if the USB
|
||
serial connection is lost, Modbus will hang in an infinite
|
||
loop.
|
||
|
||
This is a problem in the handling of select() and read()
|
||
and could probably be resolved by studying the Modbus error
|
||
handling.
|
||
|
||
A more USB-friendly solution would be to: (1) Re-connect and
|
||
(2) re-open the serial drivers. That is what is done in NSH.
|
||
When the serial USB device is removed, this terminates the
|
||
session and NSH will then try to re-open the USB device. See
|
||
the function nsh_waitusbready() in the file
|
||
apps/nshlib/nsh_usbconsole.c. When the USB serial is
|
||
reconnected the open() in the function will succeed and a new
|
||
session will be started.
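
The re-open step can be sketched as a simple retry loop. This is not
the implementation of nsh_waitusbready(); the function name and its
parameters are hypothetical:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open the serial device at devpath up to
 * max_tries times, sleeping delay_sec seconds after each failure.
 * Returns the open file descriptor, or -1 if every attempt failed.
 */

int serial_reopen(const char *devpath, int max_tries,
                  unsigned int delay_sec)
{
  int i;

  for (i = 0; i < max_tries; i++)
    {
      int fd = open(devpath, O_RDWR);
      if (fd >= 0)
        {
          return fd;          /* Device is back, resume the session */
        }

      sleep(delay_sec);       /* Not connected yet, wait and retry */
    }

  return -1;
}
```

The caller would close the stale descriptor when read() or select()
reports an error and then use the descriptor returned here in its
place.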
|
||
Status: Open
|
||
Priority: Low. This is really an enhancement request: Modbus was never
|
||
designed to work with removable serial devices.
|
||
|
||
o Other Applications & Tests (apps/examples/)
|
||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||
|
||
Title: EXAMPLES/PIPE ON CYGWIN
|
||
Description: The redirection test (part of examples/pipe) terminates
|
||
incorrectly on the Cygwin-based simulation platform (but works
|
||
fine on the Linux-based simulation platform).
|
||
Status: Open
|
||
Priority: Low
|
||
|
||
Title: EXAMPLES/SENDMAIL UNTESTED
|
||
Description: examples/sendmail is untested on the target (it has been tested
|
||
on the host, but not on the target).
|
||
Status: Open
|
||
Priority: Med
|
||
|
||
Title: EXAMPLES/NX FONT CACHING
|
||
Description: The font caching logic in examples/nx is incomplete. Fonts are
|
||
added to the cache, but never removed. When the cache is full
|
||
it stops rendering. This is not a problem for the examples/nx
|
||
code because it uses so few fonts, but if the logic were
|
||
leveraged for more general purposes, it would be a problem.
|
||
|
||
Update: see examples/nxtext for some improved font cache handling.
|
||
Update: The NXTERM font cache has been generalized and is now
|
||
offered as the standard, common font cache for all applications.
|
||
Both the nx and nxtext examples should be modified to use this
|
||
common font cache. See interfaces defined in nxfonts.h.
|
||
Status: Open
|
||
Priority: Low. This is not really a problem because examples/nx works
|
||
fine with its bogus font caching.
|
||
|
||
Title: EXAMPLES/NXTEXT ARTIFACTS
|
||
Description: examples/nxtext. Artifacts when the pop-up window is opened.
|
||
There are some artifacts that appear in the upper left hand
|
||
corner. These seem to be related to window creation. A
|
||
tiny artifact would not be surprising (the initial window
|
||
should lie at (0,0) and be of size (1,1)), but sometimes
|
||
the artifact is larger.
|
||
Status: Open
|
||
Priority: Medium.
|
||
|
||
Title: ILLEGAL CALLS TO romdisk_register()
|
||
Description: Several examples (and other things under apps/) make illegal
|
||
calls to romdisk_register(). This both violates the portable
|
||
POSIX OS interface and makes these applications un-usable in
|
||
PROTECTED and KERNEL build modes.
|
||
|
||
Non-compliant examples include:
|
||
|
||
examples/bastest, examples/elf, examples/module,
|
||
examples/nxflat, examples/posix_spawn, examples/romfs,
|
||
examples/sotest, examples/thttpd, examples/unionfs
|
||
|
||
These examples are simple demos and, hence, you could argue that
|
||
it is not so bad that they violate the interface for the purpose
|
||
of demonstration (although they do set a bad example because of
|
||
this).
|
||
|
||
These examples should, of course, use boardctl(BOARDIOC_ROMDISK)
|
||
to create the ROM disk instead of calling romdisk_register()
|
||
directly.
|
||
Status: Open
|
||
Priority: Medium.
|